Preface


Purpose of this Book 


This book is the second in a pair of books which together are intended to bring 
the reader through classical differential geometry into the modern formulation of 
the differential geometry of manifolds. The first book in the pair, by Banchoff and 
Lovett, entitled Differential Geometry of Curves and Surfaces [6], introduces the 
classical theory of curves and surfaces, only assuming the calculus sequence and 
linear algebra. This book continues the development of differential geometry by 
studying manifolds — the natural generalization of regular curves and surfaces to 
higher dimensions. Though a background course in analysis is useful for this book, 
we have provided all the necessary analysis results in the text. Though [6] provides 
many examples of one- and two-dimensional manifolds that lend themselves well to 
visualization, this book does not rely on [6] and can be read independently. 

Taken on its own, this book provides an introduction to differentiable manifolds, 
geared toward advanced undergraduate or beginning graduate readers in mathemat- 
ics, retaining a view toward applications in physics. For readers primarily interested 
in physics, this book may fill a gap between the geometry typically offered in under- 
graduate programs and that expected in physics graduate programs. For example, 
some graduate programs in physics first introduce electromagnetism in the context 
of a manifold. The student who is unaccustomed to the formalism of manifolds 
may be lost in the notation at worst or, at best, be unaware of how to do explicit 
calculations on manifolds. 


What is Differential Geometry? 


Differential geometry studies properties of and analysis on curves, surfaces, and 
higher dimensional spaces using tools from calculus and linear algebra. Just as the 
introduction of calculus expands the descriptive and predictive abilities of nearly 
every scientific field, so the use of calculus in geometry brings about avenues of 
inquiry that extend far beyond classical geometry. 

Though differential geometry does not possess the same restrictions as Euclidean 
geometry on what types of objects it studies, not every conceivable set of points 
falls within the purview of differential geometry. One of the underlying themes
of this book is the development and description of the types of geometric sets on
which it is possible to “do calculus.” This leads to the definition of differentiable 
manifolds. A second, and somewhat obvious, theme is how to actually do calculus 
(measure rates of change of functions or interdependent variables) on manifolds. A 
third general theme is how to “do geometry” (measure distances, areas and angles) 
on such geometric objects. This theme leads us to the notion of a Riemannian 
manifold. 

Applications of differential geometry outside of mathematics first arise in me- 
chanics in the study of the dynamics of a moving particle or system of particles. 
The study of inertial frames is in common to both physics and differential geome- 
try. Most importantly, however, differential geometry is necessary to study physical 
systems that involve functions on curved spaces. For example, just to make sense of 
directional derivatives of the surface temperature at a point on the earth (a sphere) 
requires analysis on manifolds. The study of mechanics and electromagnetism on 
a curved surface also requires analysis on a manifold. Finally, arguably the most 
revolutionary application of differential geometry to physics came from Einstein’s 
theory of general relativity, in which spacetime becomes curved in the presence of 
mass/energy.


Organization of Topics 


A typical calculus sequence analyzes one-variable real functions ($\mathbb{R} \to \mathbb{R}$), parametric
curves ($\mathbb{R} \to \mathbb{R}^n$), multivariable functions ($\mathbb{R}^n \to \mathbb{R}$), and vector fields ($\mathbb{R}^2 \to \mathbb{R}^2$
or $\mathbb{R}^3 \to \mathbb{R}^3$). This does not quite reach the full generality necessary for the definition
of manifolds. Chapter 1 presents the analysis of functions $f : \mathbb{R}^n \to \mathbb{R}^m$ for
any positive integers $n$ and $m$.

Chapter 2 discusses the concept and calculus of variable frames. Variable frames 
arise naturally when using curvilinear coordinates, in the differential geometry of 
curves (see Chapters 1, 3, and 8 of [5]), and, in physics, in the mechanics of a mov- 
ing particle. In special relativity, of critical importance are momentarily comoving 
reference frames (MCRFs), which are yet other examples of variable frames. Im- 
plicit in our treatment of variable frames is a view toward Lie algebras. However, 
to retain the chosen level of this book, we do not develop that theory here. 

Chapter 3 defines the category of differentiable manifolds. Manifolds serve as the 
appropriate and most complete generalization to higher dimensions of regular curves 
and regular surfaces. The chapter also introduces the definition for the tangent space 
on a manifold and attempts to provide the underlying intuition behind the abstract 
definitions. 

Before jumping into the analysis on manifolds, Chapter 4 introduces some neces- 
sary background in multilinear algebra. We focus on bilinear forms, dual spaces, au- 
tomorphisms of nondegenerate bilinear forms, and tensor products of vector spaces. 

Chapter 5 then develops the analysis on differentiable manifolds, including the 
differentials of functions between manifolds, vector fields, differential forms, and 
integration.


Chapter 6 introduces Riemannian geometry without any pretension of being
comprehensive. One can easily take an entire course on Riemannian geometry, the 
proper context in which one can do both calculus and geometry on a curved space. 
The chapter introduces the notions of metrics, connections, geodesics, parallel trans- 
port and the curvature tensor. 


Having developed the technical machinery of manifolds, in Chapter 7 we apply
our theory to a few areas in physics. We consider the Hamiltonian formulation
of dynamics, with a view toward symplectic manifolds; the tensorial formulation of 
electromagnetism; a few geometric concepts involved in string theory, namely the 
properties of the world sheet which describes a string moving in a Minkowski space; 
and some fundamental concepts in general relativity. 


In order to be rigorous and still only require the standard core in most under- 
graduate math programs, three appendices provide any necessary background from 
topology, calculus of variations, and a few additional results from multilinear alge- 
bra. The reader without any background in analysis would be served by consulting 
Appendix A on point set topology before Chapter 3. 


A Comment on Using the Book 


Because of the intended purpose of the book, it can serve well either as a textbook 
or for self-study. The conversational style attempts to introduce new concepts in 
an intuitive way, explaining why we formulate certain definitions as we do. As a 
mathematics text, this book provides proofs or references for all theorems. On the 
other hand, this book does not supply all the physical theory and discussion behind 
all the application topics we broach.


Each section concludes with an ample collection of exercises. Problems marked 
with (*) indicate difficulty which may be related to technical ability, insight, or 
length. 


As mentioned above, this book only assumes prior knowledge of multivariable 
calculus and linear algebra. A few key results presented in this textbook rely on 
theorems from the theory of differential equations but either the calculations are all 
spelled out or a reference to the appropriate theorem has been provided. Therefore, 
except in the case of exercises about geodesics, experience with differential equations 
is helpful though not necessary. 


From the perspective of an instructor using this as a course textbook, the
author intends every section to correspond to one 60-minute lecture period. With
the assumption of a 16-week semester, a course using this book should find the time
to cover all main sections and the appendices on topology. If an instructor knows that
his or her students have enough analysis or topology, Chapter 1 or Appendix A can
be skipped.


Notation


It has been said jokingly that “differential geometry is the study of things that are 
invariant under a change of notation.” A quick perusal of the literature on differ- 
ential geometry shows that mathematicians and physicists usually present topics in 
this field in a variety of different ways. One could argue that notational differences 
have contributed to a communication gap between mathematicians and physicists. 
In addition, the classical and modern formulations of many differential geometric 
concepts vary significantly. Whenever different notations or modes of presentation 
exist for a topic (e.g. differentials, metric tensor, tensor fields), this book attempts 
to provide an explicit coordination between the notation variances. 

As a comment on vector and tensor notation, this book consistently uses the
following conventions. A vector or vector function in a Euclidean vector space is
denoted by $\vec{v}$, $\vec{X}(t)$ or $\vec{X}(u,v)$. Vectors in an arbitrary vector space, curves on
manifolds, tangent vectors to a manifold, vector fields or tensor fields have no
over-right-arrow designation and are written, for example, as $v$, $\gamma$, $X$ or $T$. A fair
number of physics texts use a bold font like $\mathbf{g}$ or $\mathbf{A}$ to indicate tensors or tensor
fields. Therefore, when discussing tensors taken from a physics context, we also use
that notation.

Different texts also employ a variety of notations to express the coordinates of a 
vector with respect to a given basis. In this textbook, we regularly use the following 
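notation. If $V$ is a vector space with an ordered basis $\mathcal{B} = (e_1, e_2, \ldots, e_n)$, then the
coordinates of a vector $v \in V$ with respect to $\mathcal{B}$ are denoted by $[v]_{\mathcal{B}}$. More precisely,

$$[v]_{\mathcal{B}} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} \quad \text{if and only if} \quad v = v_1 e_1 + v_2 e_2 + \cdots + v_n e_n.$$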


As a point of precision, when discussing coordinates we must use an ordered basis 
since the order of vectors in the n-tuple matters for associating the correct coordi- 
nate. 

Beginning in Chapter 2, we switch from this typical notation to writing the indices
of coordinates in a superscript. So we will refer to the coordinates of $v \in V$
with respect to $\mathcal{B}$ as $(v^i)$. This switch in notation from that developed in introductory
linear algebra courses is standard in differential geometry and multilinear
algebra. The reason for this switch is explained fully in Section 4.1. In this context,
the superscript is not a power but an index. This modified notation is particularly
useful to recognize the difference between a (contravariant) vector and a dual vector
(also called covector) and then to use Einstein’s summation convention. This new
notation is standard in differential geometry, including applications in physics.

For linear transformations and their associated matrices, this book uses the
following convention. Suppose also that $W$ is a vector space with a basis $\mathcal{B}'$ and
that $T$ is a linear transformation $T : V \to W$. Then we denote by $[T]^{\mathcal{B}}_{\mathcal{B}'}$ the matrix
representing $T$ with respect to the basis $\mathcal{B}$ on $V$ and the basis $\mathcal{B}'$ on $W$. We recall
that this matrix is defined as the matrix such that

$$[T(v)]_{\mathcal{B}'} = [T]^{\mathcal{B}}_{\mathcal{B}'} \, [v]_{\mathcal{B}}$$

for all $v \in V$.

The authors of [6] chose the following notations for certain specific objects of
interest in differential geometry of curves and surfaces. Often $\gamma$ indicates a curve
parametrized by $\vec{X}(t)$ while writing $\vec{X}(t) = \vec{X}(u(t), v(t))$ indicates a curve on a
surface. The unit tangent and the binormal vectors of a curve in space are written
in the standard notation $\vec{T}(t)$ and $\vec{B}(t)$ but the principal normal is written $\vec{P}(t)$,
reserving $\vec{N}(t)$ to refer to the unit normal vector to a curve on a surface. For a
plane curve, $\vec{U}(t)$ is the vector obtained by rotating $\vec{T}(t)$ by a positive angle of $\pi/2$.
Furthermore, we denote by $\kappa_g(t)$ the curvature of a plane curve to identify it as the
geodesic curvature of a curve on a surface. When these concepts occur in this text,
we use the same conventions as [6].

Occasionally, there arise irreconcilable discrepancies in habits of notation, e.g.,
how to place the signs on a Minkowski metric, how one defines $\theta$ and $\phi$ in spherical
coordinates, what units to use in electromagnetism, etc. In these instances the text
makes a choice that best suits its purpose and philosophical leanings, and indicates
commonly used alternatives.


Changes in the Second Edition 


The second edition of this text arose from feedback from students and faculty using
this book and from the author seeing room for improvement in his personal experience
teaching from it.

As a first major change to benefit faculty using this book, the second edition 
commits that each section should correspond to one 60-minute lecture period. Con- 
sequently, some of the sections in the first edition were split in two. Part of the 
reorganization required the creation of a few new sections to cover topics, which 
the author felt had been too compressed in the first edition, e.g., orientability of 
manifolds, the Lie derivative of vector fields, applications of integration. 

The centrality of multilinear algebra in this text’s approach encouraged us to 
take that content out of the appendices in the first edition to become Chapter 4 
in the current edition. This may feel like an interlude between Chapter 3, which 
defines manifolds and differentiable maps between them, and Chapter 5, which studies
the analysis on manifolds. Nonetheless, hopefully the location of this content
makes sense since it first becomes necessary in Chapter 5. Having a regular chapter
on multilinear algebra allows for a more natural introduction to tensors and
tensor component notation.

Woven throughout, the second edition attempts to improve the presentation 
style and better foreshadow certain topics. For example, Equation (2.11) about 
how to decompose the partial derivatives in a frame of vector fields augurs the
definition of a connection on a manifold. 

Most of the exercises remained the same, though we improved the statements of 
some and modified the challenge level of the computations for others. In addition, 
we added a few new interesting problems. 


CHAPTER 1


Analysis of Multivariable Functions 


Manifolds provide a generalization of the concept of a curve or a surface, objects
introduced in the usual calculus sequence. Parametrized curves into $\mathbb{R}^n$ are
continuous functions from an interval of $\mathbb{R}$ to $\mathbb{R}^n$; parametrized surfaces in $\mathbb{R}^3$ involve
continuous functions from $\mathbb{R}^2$ to $\mathbb{R}^3$. In order to generalize the study of curves and
surfaces to the theory of manifolds, we need a solid foundation in the analysis of
multivariable functions $f : \mathbb{R}^n \to \mathbb{R}^m$.


1.1 Functions from $\mathbb{R}^n$ to $\mathbb{R}^m$


Let $U$ be a subset of $\mathbb{R}^n$ and let $f : U \to \mathbb{R}^m$ be a function from $U$ to $\mathbb{R}^m$. Writing
the input variable as

$$\vec{x} = (x_1, x_2, \ldots, x_n),$$

we denote the output assigned to $\vec{x}$ by $f(\vec{x})$ or $f(x_1, \ldots, x_n)$. Since the codomain
of $f$ is $\mathbb{R}^m$, the images of $f$ are $m$-tuples so we can write

$$\begin{aligned}
f(\vec{x}) &= (f_1(\vec{x}), f_2(\vec{x}), \ldots, f_m(\vec{x})) \\
&= (f_1(x_1, \ldots, x_n), f_2(x_1, \ldots, x_n), \ldots, f_m(x_1, \ldots, x_n)).
\end{aligned}$$

The functions $f_i : U \to \mathbb{R}$, for $i = 1, 2, \ldots, m$, are called the component functions
of $f$.

We sometimes use the notation $\vec{f}(\vec{x})$ to emphasize the fact that the codomain
$\mathbb{R}^m$ is a vector space and that any operation on $m$-dimensional vectors is permitted
on functions $\vec{f} : \mathbb{R}^n \to \mathbb{R}^m$. Therefore, some authors call such functions vector
functions of a vector variable.


In any Euclidean space $\mathbb{R}^n$, the standard basis is the set of vectors written as
$\{\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n\}$, where

$$\vec{e}_i = (0, \ldots, 0, 1, 0, \ldots, 0),$$

with the only nonzero entry 1 occurring in the $i$th coordinate. If no basis is explicitly
specified for $\mathbb{R}^n$, then it is assumed that one uses the standard basis.

At this point, a remark is in order concerning the differences in notations between
calculus and linear algebra. In calculus, one usually denotes an element of $\mathbb{R}^n$ as an
$n$-tuple and writes this element on one line as $(x_1, x_2, \ldots, x_n)$. On the other hand,
in order to reconcile vector notation with the usual manner in which we multiply a matrix
by a vector, in linear algebra we denote an element of $\mathbb{R}^n$ as a column vector

$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$$

At first pass, we might consider these differences of notation as an unfortunate result 
of history. However, the difference between column vectors and row vectors is not a 
mere variance of notation: one represents the coordinates of an element in a vector 
space V with respect to some basis, while the other represents the coordinates of an 
element in the dual vector space V*, a concept which we develop later. In the rest 
of this book, we will write the components of a vector function on one line as per 
the n-tuple notation, but whenever a vector or vector function appears in a linear 
algebraic context, we write it as a column vector. 

In the typical calculus sequence, we encounter vector functions or vector-valued 
functions in the following contexts. 


Example 1.1.1 (Curves in $\mathbb{R}^n$). A parametrized curve into $n$-dimensional space is a
continuous function $\vec{x} : I \to \mathbb{R}^n$, where $I$ is some interval of $\mathbb{R}$. Parametrized curves
are vector functions of a single variable. We can view the independent variable as
coming from a one-dimensional real vector space.


Example 1.1.2 (Nonlinear Coordinate Changes). A general change of coordinates
in $\mathbb{R}^2$ is a function $F : U \to \mathbb{R}^2$, where $U$ is the subset of $\mathbb{R}^2$ in which the coordinates
are defined. For example, the change from polar coordinates to Cartesian
coordinates is given by the function $F : \mathbb{R}^2 \to \mathbb{R}^2$ defined by

$$F(r, \theta) = (r\cos\theta, r\sin\theta).$$


Example 1.1.3. In multivariable calculus, we encounter functions $F : \mathbb{R}^n \to \mathbb{R}$,
written as $F(x_1, x_2, \ldots, x_n)$. All such functions are just examples of vector functions
of a vector variable with a codomain of $\mathbb{R}$.




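Example 1.1.4. As an example of a function from $\mathbb{R}^2$ to $\mathbb{R}^3$, consider the function

$$F(u, v) = \left( \frac{2v(1 - u^2)}{(1 + u^2)(1 + v^2)}, \; \frac{4uv}{(1 + u^2)(1 + v^2)}, \; \frac{1 - v^2}{1 + v^2} \right).$$

Notice that the component functions satisfy

$$\begin{aligned}
F_1^2 + F_2^2 + F_3^2 &= \frac{4v^2(1 - u^2)^2 + 16u^2v^2 + (1 + u^2)^2(1 - v^2)^2}{(1 + u^2)^2(1 + v^2)^2} \\
&= \frac{4v^2(1 + u^2)^2 + (1 + u^2)^2(1 - v^2)^2}{(1 + u^2)^2(1 + v^2)^2} \\
&= \frac{(1 + u^2)^2(1 + v^2)^2}{(1 + u^2)^2(1 + v^2)^2} = 1.
\end{aligned}$$

Thus, the image of $F$ lies on the unit sphere $S^2 = \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1\}$.
Note that $F$ does not surject onto $S^2$. Assuming $x^2 + y^2 + z^2 = 1$, if $F(u, v) =
(x, y, z)$, then in particular

$$z = \frac{1 - v^2}{1 + v^2} \iff v = \pm\sqrt{\frac{1 - z}{1 + z}},$$

which implies that $-1 < z \le 1$, and hence, the point $(0, 0, -1)$ is not in the range
of $F$. Furthermore, since

$$z^2 + \left( \frac{2v}{1 + v^2} \right)^2 = 1, \quad \text{and thus} \quad \frac{2v}{1 + v^2} = \pm\sqrt{1 - z^2},$$

for any fixed $z$, we have

$$x = \pm\frac{1 - u^2}{1 + u^2}\sqrt{1 - z^2} \quad \text{and} \quad y = \pm\frac{2u}{1 + u^2}\sqrt{1 - z^2}.$$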


But then, if $y = 0$, it is impossible to obtain $x = -\sqrt{1 - z^2}$. Consequently, the
image of $F$ is

$$F(\mathbb{R}^2) = S^2 - \{(x, y, z) \in S^2 \mid x = -\sqrt{1 - z^2} \text{ with } z < 1\}.$$

Figure 1.1 shows the image of $F$ over the rectangle $(u, v) \in [-2, 5] \times [0.5, 5]$.
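
The identity $F_1^2 + F_2^2 + F_3^2 = 1$ from Example 1.1.4 can be spot-checked numerically. The following is only a minimal sketch: the sampling range, the seed, and the helper name `norm_squared` are assumptions made for illustration, and a numerical check of this kind is not a substitute for the algebraic argument.

```python
import random

def F(u, v):
    """The map R^2 -> R^3 of Example 1.1.4."""
    d = (1 + u**2) * (1 + v**2)
    return (2 * v * (1 - u**2) / d, 4 * u * v / d, (1 - v**2) / (1 + v**2))

def norm_squared(p):
    """Sum of the squares of the components of p (hypothetical helper)."""
    return sum(c * c for c in p)

# Sample points in an arbitrary square; range and seed are illustrative choices.
random.seed(0)
samples = [(random.uniform(-10, 10), random.uniform(-10, 10)) for _ in range(1000)]

# Largest deviation of ||F(u, v)||^2 from 1 over the sample.
max_error = max(abs(norm_squared(F(u, v)) - 1.0) for u, v in samples)

# Smallest third coordinate observed; the text shows it can never reach -1.
min_z = min(F(u, v)[2] for u, v in samples)

print(max_error, min_z)
```

The deviation stays at the level of floating-point rounding, and the third coordinate stays strictly above $-1$, in line with the observation that $(0, 0, -1)$ is not in the range of $F$.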


There are a few different ways to visualize functions, particularly when $n$ and
$m$ are less than or equal to 3. Recall that the graph of a function $f : \mathbb{R}^n \to \mathbb{R}^m$ is
the subset of $\mathbb{R}^n \times \mathbb{R}^m = \mathbb{R}^{n+m}$ defined by

$$\{(x_1, \ldots, x_n, y_1, \ldots, y_m) \in \mathbb{R}^{n+m} \mid (y_1, \ldots, y_m) = f(x_1, \ldots, x_n)\}.$$

We can visualize this explicitly when $m + n \le 3$ with a three-dimensional graphic.
When $m = 1$, we recover the usual method to depict functions $f : \mathbb{R} \to \mathbb{R}$ and
$f : \mathbb{R}^2 \to \mathbb{R}$.

Figure 1.1: Portion of the image for Example 1.1.4.


For functions $F : \mathbb{R}^2 \to \mathbb{R}$ (respectively $F : \mathbb{R}^3 \to \mathbb{R}$), another way to attempt to
visualize $F$ is by plotting together (or in succession if one has dynamical graphing
capabilities) a collection of level curves (respectively surfaces) defined by $F(x, y) =
c_i$ (respectively $F(x, y, z) = c_i$) for a discrete set of values $c_i$. This is typically called
a contour diagram of $F$. Figure 1.2 depicts a contour diagram of $2y/(x^2 + y^2 + 1)$
with $c = 0, \pm 0.2, \pm 0.4, \pm 0.6, \pm 0.8$.

In multivariable calculus or in a basic differential geometry course ([5]), one
typically uses yet another technique to visualize functions of the form $\vec{f} : \mathbb{R} \to \mathbb{R}^m$,
for $m = 2$ or 3. By plotting the points that make up the image of $\vec{f}$, we see a plane
or space curve. In doing so, we lose visual information about how fast one travels
along the curve. Figure 1.3 shows the image of the so-called space cardioid, given
by the function

$$\vec{f}(t) = ((1 - \cos t)\cos t, \; (1 - \cos t)\sin t, \; \sin t).$$


Similarly, in the study of surfaces, it is common to depict a function $F : \mathbb{R}^2 \to \mathbb{R}^3$
by plotting its image in $\mathbb{R}^3$. (The graph of a function of the form $\mathbb{R}^2 \to \mathbb{R}^3$ is a
subset of $\mathbb{R}^5$, which is quite difficult to visualize no matter what computer tools one
has at one’s disposal!)

We define the usual operations on functions as expected. 


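Definition 1.1.5. Let $\vec{f}$ and $\vec{g}$ be two functions defined over a subset $U$ of $\mathbb{R}^n$ with
codomain $\mathbb{R}^m$. Then we define the following functions:

1. $(\vec{f} + \vec{g}) : U \to \mathbb{R}^m$, where $(\vec{f} + \vec{g})(\vec{x}) = \vec{f}(\vec{x}) + \vec{g}(\vec{x})$.
2. $(\vec{f} \cdot \vec{g}) : U \to \mathbb{R}$, where $(\vec{f} \cdot \vec{g})(\vec{x}) = \vec{f}(\vec{x}) \cdot \vec{g}(\vec{x})$.
3. If $m = 3$, $(\vec{f} \times \vec{g}) : U \to \mathbb{R}^3$, where $(\vec{f} \times \vec{g})(\vec{x}) = \vec{f}(\vec{x}) \times \vec{g}(\vec{x})$.

Figure 1.2: A contour diagram. Figure 1.3: A space curve.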


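Definition 1.1.6. Let $\vec{f}$ be a function from a subset $U \subset \mathbb{R}^n$ to $\mathbb{R}^m$, and let $\vec{g}$
be a function from $V \subset \mathbb{R}^m$ to $\mathbb{R}^s$. If the image of $\vec{f}$ is a subset of $V$, then the
composition function $\vec{g} \circ \vec{f}$ is the function $U \to \mathbb{R}^s$ defined by

$$(\vec{g} \circ \vec{f})(\vec{x}) = \vec{g}(\vec{f}(\vec{x})).$$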


Out of the vast variety of possible functions one could study, the class of linear 
functions serves a fundamental role in the analysis of multivariable functions. We 
remind the reader of various properties of linear functions. 


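Definition 1.1.7. A function $F : \mathbb{R}^n \to \mathbb{R}^m$ is called a linear function if

$$F(\vec{x} + \vec{y}) = F(\vec{x}) + F(\vec{y}) \quad \text{for all } \vec{x}, \vec{y} \in \mathbb{R}^n,$$
$$F(k\vec{x}) = kF(\vec{x}) \quad \text{for all } k \in \mathbb{R} \text{ and all } \vec{x} \in \mathbb{R}^n.$$

If a function $F : \mathbb{R}^n \to \mathbb{R}^m$ is linear, then

$$F(\vec{0}) = F(\vec{0} - \vec{0}) = F(\vec{0}) - F(\vec{0}) = \vec{0},$$

and hence $F$ maps the origin of $\mathbb{R}^n$ to the origin of $\mathbb{R}^m$.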
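If $\mathcal{B} = \{\vec{f}_1, \vec{f}_2, \ldots, \vec{f}_n\}$ is a basis of $\mathbb{R}^n$, then any vector $\vec{v} \in \mathbb{R}^n$ can be written
uniquely as a linear combination of vectors in $\mathcal{B}$ as

$$\vec{v} = c_1\vec{f}_1 + c_2\vec{f}_2 + \cdots + c_n\vec{f}_n.$$

One often writes the coefficients in linear algebra as the column vector

$$[\vec{v}]_{\mathcal{B}} = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}.$$

If the basis $\mathcal{B}$ is not specified, one assumes that the coefficients are given in terms
of the standard basis. If $F$ is a linear function, then

$$F(\vec{v}) = c_1F(\vec{f}_1) + c_2F(\vec{f}_2) + \cdots + c_nF(\vec{f}_n);$$

hence, to know all outputs of $F$ one needs to know the coefficients of $[\vec{v}]_{\mathcal{B}}$ and the
output of the basis vectors of $\mathcal{B}$. Suppose also that $\mathcal{B}' = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_m\}$ is a basis
of $\mathbb{R}^m$. If the $\mathcal{B}'$-coordinates of the outputs of the vectors in $\mathcal{B}$ are

$$[F(\vec{f}_1)]_{\mathcal{B}'} = \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix}, \quad
[F(\vec{f}_2)]_{\mathcal{B}'} = \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{pmatrix}, \quad \ldots, \quad
[F(\vec{f}_n)]_{\mathcal{B}'} = \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix},$$

then the image of the vector $\vec{v} \in \mathbb{R}^n$ is given by

$$[F(\vec{v})]_{\mathcal{B}'} = c_1 \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix}
+ c_2 \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{pmatrix}
+ \cdots + c_n \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}.$$

The matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

is called the $\mathcal{B}, \mathcal{B}'$-matrix representing the linear function $F$ and is denoted by $[F]^{\mathcal{B}}_{\mathcal{B}'}$.
Therefore,

$$[F(\vec{v})]_{\mathcal{B}'} = [F]^{\mathcal{B}}_{\mathcal{B}'} \, [\vec{v}]_{\mathcal{B}}$$

for all $\vec{v} \in \mathbb{R}^n$.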
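
As an illustration of the construction above, the matrix of a linear function can be assembled column by column from the images of the basis vectors. The sketch below assumes the standard bases on both the domain and the codomain; the map `F` and the helper names `matrix_of` and `apply` are hypothetical choices made only for this example.

```python
def matrix_of(F, n):
    """Build the matrix of a linear map F on R^n with respect to the standard
    bases: column j holds the coordinates of F(e_j)."""
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]  # j-th standard basis vector
        columns.append(F(e_j))
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def apply(A, v):
    """Multiply the matrix A (a list of rows) by the coordinate vector v."""
    return [sum(a * c for a, c in zip(row, v)) for row in A]

def F(x):
    """A hypothetical linear map R^3 -> R^2, chosen only for illustration."""
    return [x[0] + 2 * x[1], 3 * x[1] - x[2]]

A = matrix_of(F, 3)
v = [2, -1, 5]
print(A)                  # rows of the representing matrix
print(apply(A, v), F(v))  # the two results agree
```

Multiplying the matrix by the coordinates of a vector reproduces the value of the function, which is the content of the identity $[F(\vec{v})]_{\mathcal{B}'} = [F]^{\mathcal{B}}_{\mathcal{B}'}[\vec{v}]_{\mathcal{B}}$ in the special case of standard bases.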

Given a linear function $F : \mathbb{R}^n \to \mathbb{R}^m$, one calls the image of $F$ the set $\operatorname{Im} F =
F(\mathbb{R}^n)$, also called the range. The kernel of $F$ is the zero set

$$\ker F = \{\vec{v} \in \mathbb{R}^n \mid F(\vec{v}) = \vec{0}\}.$$

The image $\operatorname{Im} F$ is a vector subspace of the codomain $\mathbb{R}^m$ and the kernel is a
subspace of the domain $\mathbb{R}^n$. The rank of $F$ is the dimension $\dim(\operatorname{Im} F)$ and can
be shown to be equal to the size of the largest nonvanishing minor of any matrix
representing $F$, which is independent of the bases. The image of $F$ cannot have a
greater dimension than either the domain or the codomain, so

$$\operatorname{rank} F \le \min\{m, n\},$$

and one says that $F$ has maximal rank if $\operatorname{rank} F = \min\{m, n\}$. It is not hard to
show that a linear function $F : \mathbb{R}^n \to \mathbb{R}^m$ is surjective if and only if $\operatorname{rank} F = m$
and $F$ is injective if and only if $\operatorname{rank} F = n$.

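The rank is also useful in determining the linear dependence between a set of
vectors. If $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ is a set of vectors in $\mathbb{R}^m$, then the matrix

$$A = \begin{pmatrix} \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_n \end{pmatrix},$$

where the $\vec{u}_i$ are viewed as column vectors, represents a linear function $F : \mathbb{R}^n \to
\mathbb{R}^m$, with

$$\operatorname{Im} F = \operatorname{Span}\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}.$$

Thus, the set of vectors $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ is linearly independent if and only if
$\operatorname{rank} F = n$.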

In the case of $n = m$, the determinant provides an alternative characterization
of linear independence. If $F$ is a linear function from $\mathbb{R}^n$ to itself with associated
matrix $A$, then $|\det A|$ is the $n$-volume of the image under $F$ of the unit $n$-cube.
Consequently, if the columns of $A$ are not linearly independent, the $n$-volume of
this parallelepiped will be 0. This leads one to a fundamental summary theorem in
linear algebra.


Theorem 1.1.8. For a linear function $F : \mathbb{R}^n \to \mathbb{R}^n$ with associated square matrix
$A$, the following statements are equivalent:

1. $\operatorname{rank} F = n$.
2. $\det A \neq 0$.
3. $\operatorname{Im} F = \mathbb{R}^n$.
4. $\ker F = \{\vec{0}\}$.
5. The column vectors of $A$ are linearly independent.
6. The column vectors of $A$ form a basis of $\mathbb{R}^n$.
7. The column vectors of $A$ span $\mathbb{R}^n$.
8. $F$ has an inverse function.
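
The agreement between the rank condition and the determinant condition in Theorem 1.1.8 can be observed on small examples. The following sketch is illustrative only: the two $3 \times 3$ matrices are hypothetical, the rank is computed by Gaussian elimination over exact rationals (one of several possible methods), and `det3` is a helper written just for the $3 \times 3$ case.

```python
from fractions import Fraction

def rank(A):
    """Rank of a matrix (list of rows) via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        # Look for a nonzero entry in column c at or below row r.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Two hypothetical matrices: the second row of `singular` is twice its first row.
invertible = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]
singular = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]

print(rank(invertible), det3(invertible))
print(rank(singular), det3(singular))
```

For the first matrix the rank equals 3 and the determinant is nonzero; for the second the rank drops below 3 and the determinant vanishes, as statements 1 and 2 of the theorem require.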


We remind the reader that matrix multiplication is defined in such a way
that if $A$ is the matrix for a linear function $F : \mathbb{R}^n \to \mathbb{R}^m$ and $B$ is the matrix for
a linear function $G : \mathbb{R}^p \to \mathbb{R}^n$, then the product $AB$ is the matrix representing the
composition $F \circ G : \mathbb{R}^p \to \mathbb{R}^m$. In other words,

$$[F \circ G]^{\mathcal{A}}_{\mathcal{C}} = [F]^{\mathcal{B}}_{\mathcal{C}} \, [G]^{\mathcal{A}}_{\mathcal{B}},$$

where $\mathcal{A}$, $\mathcal{B}$ and $\mathcal{C}$ are bases on $\mathbb{R}^p$, $\mathbb{R}^n$, and $\mathbb{R}^m$ respectively. Furthermore, if
$m = n$ and $\operatorname{rank} F = n$, then the matrix $A^{-1}$ is the matrix that represents the
inverse function of $F$.
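
The statement that the product $AB$ represents the composition $F \circ G$ can be checked directly on coordinates. This is a minimal sketch under the assumption that all bases are the standard ones; the matrices `A` and `B`, the test vector, and the helpers `matmul` and `apply` are hypothetical.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def apply(A, v):
    """Multiply the matrix A by the coordinate vector v."""
    return [sum(a * c for a, c in zip(row, v)) for row in A]

# Hypothetical matrices: A represents F : R^2 -> R^3, B represents G : R^4 -> R^2.
A = [[1, 2], [0, 1], [3, -1]]
B = [[2, 0, 1, 1], [1, -1, 0, 4]]
v = [1, 2, 3, 4]

composed = apply(A, apply(B, v))    # F(G(v)), computed in two steps
via_product = apply(matmul(A, B), v)  # (AB) v, computed with the product matrix
print(composed, via_product)
```

Both computations give the same vector in $\mathbb{R}^3$, and the product `matmul(A, B)` has three rows and four columns, matching a map $\mathbb{R}^4 \to \mathbb{R}^3$.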

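A particularly important case of matrices representing linear transformations is
the change of basis matrix. Let $\mathcal{B}$ and $\mathcal{B}'$ be two bases on $\mathbb{R}^n$. The change of basis
matrix from $\mathcal{B}$ to $\mathcal{B}'$ coordinates is $M = [\mathrm{id}]^{\mathcal{B}}_{\mathcal{B}'}$, where $\mathrm{id} : \mathbb{R}^n \to \mathbb{R}^n$ is the identity
transformation. In other words, for all $\vec{v} \in \mathbb{R}^n$,

$$[\vec{v}]_{\mathcal{B}'} = M[\vec{v}]_{\mathcal{B}}.$$

If $\mathcal{B} = \{\vec{f}_1, \vec{f}_2, \ldots, \vec{f}_n\}$, then $M = \begin{pmatrix} [\vec{f}_1]_{\mathcal{B}'} & [\vec{f}_2]_{\mathcal{B}'} & \cdots & [\vec{f}_n]_{\mathcal{B}'} \end{pmatrix}$.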


PROBLEMS 


1.1.1. Consider the function $F$ in Example 1.1.4. Prove algebraically that if the domain
is restricted to $\mathbb{R} \times (0, +\infty)$, it is injective. What is the image of $F$ in this case?


1.1.2. Let $F : \mathbb{R}^2 \to \mathbb{R}^2$ be the function defined by $F(s, t) = (s^2 - t^2, 2st)$, and let
$G : \mathbb{R}^2 \to \mathbb{R}^2$ be the function defined by $G(u, v) = (2u^2 - 3v, uv + v^3)$. Calculate
the component functions of $F \circ G$ and of $G \circ F$.


1.1.3. Show that the function $\vec{X} : [0, 2\pi] \times [0, \pi] \to \mathbb{R}^3$, with

$$\vec{X}(x_1, x_2) = (\cos x_1 \sin x_2, \sin x_1 \sin x_2, \cos x_2),$$

defines a mapping onto the unit sphere in $\mathbb{R}^3$. Which points on the unit sphere
have more than one preimage?


1.1.4. Consider the function $F$ from $\mathbb{R}^3$ to itself defined by

$$F(x_1, x_2, x_3) = (x_1 + 2x_2 + 3x_3, \; 4x_1 + 5x_2 + 6x_3, \; 7x_1 + 8x_2 + 9x_3).$$

Prove that this is a linear function. Find the matrix associated to $F$ (with respect
to the standard basis). Find the rank of $F$, and if the rank is less than 3, find
equations for the image of $F$.


1.1.5. Consider a line $L$ in $\mathbb{R}^n$ traced out by the parametric equation $\vec{x}(t) = t\vec{a} + \vec{b}$.
Prove that for any linear function $F : \mathbb{R}^n \to \mathbb{R}^m$, the image $F(L)$ is either a line
or a point.


1.1.6. Let $F : \mathbb{R}^n \to \mathbb{R}^m$ be a linear function, and let $L_1$ and $L_2$ be parallel lines in $\mathbb{R}^n$.
Prove that $F(L_1)$ and $F(L_2)$ are either both points or both lines in $\mathbb{R}^m$. If $F(L_1)$
and $F(L_2)$ are both lines, prove that they are parallel.


1.1.7. Let $F : \mathbb{R}^n \to \mathbb{R}^m$ be a linear function represented by a matrix $A$ with respect
to a basis $\mathcal{B}$ on $\mathbb{R}^n$ and a basis $\mathcal{B}'$ on $\mathbb{R}^m$. Prove that $F$ maps every pair of
perpendicular lines in $\mathbb{R}^n$ to another pair of perpendicular lines in $\mathbb{R}^m$ if and only
if $A^\top A = \lambda I$, for some nonzero real number $\lambda$.


1.1.8. Let $\vec{w}$ be a nonzero vector in $\mathbb{R}^n$. Define the function $F : \mathbb{R}^n \to \mathbb{R}$ as

$$F(\vec{x}) = \vec{w} \cdot \vec{x}.$$

Prove that $F$ is a linear function. Find the matrix associated to $F$ (with respect
to the standard basis).


1.1.9. Let $\vec{w}$ be a nonzero vector in $\mathbb{R}^3$. Define the function $F : \mathbb{R}^3 \to \mathbb{R}^3$ as

$$F(\vec{x}) = \vec{w} \times \vec{x}.$$

Prove that $F$ is a linear function. Find the matrix associated to $F$ (with respect
to the standard basis). Prove that $\operatorname{rank} F = 2$.


1.2 Continuity, Limits, and Differentiability 


Intuitively, a function is called continuous if it preserves “nearness.” A rigorous
mathematical definition of continuity for functions from $\mathbb{R}^n$ to $\mathbb{R}^m$ is hardly any
different from that for functions from $\mathbb{R}$ to $\mathbb{R}$.

In calculus of a real variable, one does not study functions defined over a discrete
set of real values because the notions behind continuity and differentiability do not
make sense over such sets. Instead, one often assumes the function is defined over
some interval. Similarly, for the analysis of functions from $\mathbb{R}^n$ to $\mathbb{R}^m$, one does not study
functions defined on arbitrary subsets of $\mathbb{R}^n$. One typically considers functions
defined over what is called an open set in $\mathbb{R}^n$, a notion we define now.


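Definition 1.2.1. The open ball around $\vec{x}_0$ of radius $r$ is the set

$$B_r(\vec{x}_0) = \{\vec{x} \in \mathbb{R}^n : \|\vec{x} - \vec{x}_0\| < r\}.$$

A subset $U \subset \mathbb{R}^n$ is called open if for all $\vec{x} \in U$ there exists an $r > 0$ such that
$B_r(\vec{x}) \subset U$.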


Intuitively speaking, the definition of an open set $U$ in $\mathbb{R}^n$ implies that at every
point $p \in U$ it is possible to “move” in any direction by at least a little amount $\varepsilon$ and
still remain in $U$. This means that in some sense $U$ captures the full dimensionality
of the ambient space $\mathbb{R}^n$. This is why, when studying the analysis of functions from
$\mathbb{R}^n$ to $\mathbb{R}^m$, we narrow our attention to functions $F : U \to \mathbb{R}^m$, where $U$ is an open
subset of $\mathbb{R}^n$.

The reader is encouraged to consult Subsection A.1.2 in Appendix A for more
background on open and closed sets. The situation in which we need to consider
an open set $U$ and a point $\vec{x}_0$ in $U$ is so common that another terminology exists
for $U$ in this case.


Definition 1.2.2. Let $\vec{x}_0 \in \mathbb{R}^n$. Any open set $U$ in $\mathbb{R}^n$ such that $\vec{x}_0 \in U$ is called
an open neighborhood, or more simply, a neighborhood, of $\vec{x}_0$.


We are now in a position to formally define continuity. 


Definition 1.2.3. Let $U$ be an open subset of $\mathbb{R}^n$, and let $F$ be a function from $U$
into $\mathbb{R}^m$. The function $F$ is called continuous at the point $\vec{x}_0 \in U$ if $F(\vec{x}_0)$ exists
and if, for all $\varepsilon > 0$, there exists a $\delta > 0$ such that for all $\vec{x} \in U$,

$$\|\vec{x} - \vec{x}_0\| < \delta \implies \|F(\vec{x}) - F(\vec{x}_0)\| < \varepsilon.$$

The function $F$ is called continuous on $U$ if it is continuous at every point of $U$.

With the language of open balls, one can rephrase the definition of continuity
as follows. Let $U$ be an open subset of $\mathbb{R}^n$. A function $F : U \to \mathbb{R}^m$ is continuous
at a point $\vec{x}_0$ if for all $\varepsilon > 0$ there exists a $\delta > 0$ such that

$$F(B_\delta(\vec{x}_0)) \subset B_\varepsilon(F(\vec{x}_0)). \tag{1.1}$$

Sections A.1.2 and A.1.4 in Appendix A provide a comprehensive discussion
about open sets in a metric space, a generalization of $\mathbb{R}^n$, and continuity of functions
between metric spaces. We point out that using the language of open sets, Definition
1.2.3 can be rephrased once more in a manner that lines up with the definition of
continuity of functions between topological spaces.


Proposition 1.2.4. Let $F : U \to \mathbb{R}^m$ be a function, where $U \subset \mathbb{R}^n$ is open. The
function $F$ is continuous if and only if $F^{-1}(V)$ is open for all open sets $V \subset \mathbb{R}^m$.


Proof. Suppose the function $F$ is continuous. Let $V$ be open in $\mathbb{R}^m$ and let $\vec{x}_0 \in
F^{-1}(V)$, which means that $F(\vec{x}_0) \in V$. Since $V$ is open, there exists $\varepsilon > 0$ such that
$B_\varepsilon(F(\vec{x}_0)) \subset V$. By (1.1), there exists $\delta > 0$ such that $F(B_\delta(\vec{x}_0)) \subset B_\varepsilon(F(\vec{x}_0))$.
This means that $B_\delta(\vec{x}_0) \subset F^{-1}(V)$, which, since $\vec{x}_0$ was arbitrary in $F^{-1}(V)$, implies that
$F^{-1}(V)$ is open.

Conversely, suppose that $F^{-1}(V)$ is open for all open sets $V \subset \mathbb{R}^m$. Let $\vec{x}_0 \in U$
be any point and let $\varepsilon > 0$ be a real number. Consider the open ball $B_\varepsilon(F(\vec{x}_0))$. By
hypothesis, $F^{-1}(B_\varepsilon(F(\vec{x}_0)))$ is open in $\mathbb{R}^n$. Since $\vec{x}_0 \in F^{-1}(B_\varepsilon(F(\vec{x}_0)))$, we deduce
that there exists $\delta > 0$ such that $B_\delta(\vec{x}_0) \subset F^{-1}(B_\varepsilon(F(\vec{x}_0)))$. This is equivalent to
(1.1), so the proposition follows.


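Example 1.2.5. Consider the function $F : \mathbb{R}^n \to \mathbb{R}^n$ defined by 

\[ F(\vec{x}) = \begin{cases} \vec{x}/\|\vec{x}\|, & \text{if } \vec{x} \neq \vec{0}, \\ \vec{0}, & \text{if } \vec{x} = \vec{0}. \end{cases} \]

This function leaves $\vec{0}$ fixed and projects the rest of $\mathbb{R}^n$ onto the unit sphere. If 
$\vec{x}_0 \neq \vec{0}$, then for $\vec{x} \neq \vec{0}$, 

\[ \|F(\vec{x}) - F(\vec{x}_0)\| = \left\| \frac{\vec{x}}{\|\vec{x}\|} - \frac{\vec{x}_0}{\|\vec{x}_0\|} \right\| \le \left\| \frac{\vec{x}}{\|\vec{x}\|} - \frac{\vec{x}}{\|\vec{x}_0\|} \right\| + \left\| \frac{\vec{x}}{\|\vec{x}_0\|} - \frac{\vec{x}_0}{\|\vec{x}_0\|} \right\|. \]

However, 

\[ \left\| \frac{\vec{x}}{\|\vec{x}\|} - \frac{\vec{x}}{\|\vec{x}_0\|} \right\| = \frac{\bigl| \|\vec{x}_0\| - \|\vec{x}\| \bigr|}{\|\vec{x}\|\,\|\vec{x}_0\|}\,\|\vec{x}\| = \frac{\bigl| \|\vec{x}_0\| - \|\vec{x}\| \bigr|}{\|\vec{x}_0\|} \le \frac{1}{\|\vec{x}_0\|}\,\|\vec{x} - \vec{x}_0\|, \]

and thus, 

\[ \|F(\vec{x}) - F(\vec{x}_0)\| \le \frac{2}{\|\vec{x}_0\|}\,\|\vec{x} - \vec{x}_0\|. \]

Consequently, given any $\varepsilon > 0$ and setting 

\[ \delta = \min\left( \|\vec{x}_0\|, \tfrac{1}{2}\varepsilon\|\vec{x}_0\| \right), \]

we know that if $\|\vec{x} - \vec{x}_0\| < \delta$, then $\vec{x} \neq \vec{0}$ and also $\|F(\vec{x}) - F(\vec{x}_0)\| < \varepsilon$. Hence, $F$ is 
continuous at all $\vec{x}_0 \neq \vec{0}$. 
On the other hand, if $\vec{x}_0 = \vec{0}$, then for all $\vec{x} \neq \vec{x}_0$, 

\[ \|F(\vec{x}) - F(\vec{0})\| = \|F(\vec{x}) - \vec{0}\| = \|F(\vec{x})\| = 1, \]

which can never be less than $\varepsilon$ if $\varepsilon < 1$. Hence, $F$ is not continuous at $\vec{0}$. 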
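The estimate in Example 1.2.5 can be spot-checked numerically. The following is a minimal sketch in plain Python, not part of the original text: the helper names (`norm`, `dist`, `check_bound`), the sample point, and the random sampling scheme are illustrative assumptions. It samples points within $\delta = \min(\|\vec{x}_0\|, \tfrac{1}{2}\varepsilon\|\vec{x}_0\|)$ of a nonzero $\vec{x}_0$ and confirms that the images stay within $\varepsilon$ of $F(\vec{x}_0)$.

```python
import math
import random

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def F(v):
    """Radial projection onto the unit sphere, with F(0) = 0."""
    n = norm(v)
    if n == 0.0:
        return [0.0] * len(v)
    return [c / n for c in v]

def dist(a, b):
    return norm([p - q for p, q in zip(a, b)])

def check_bound(x0, eps, trials=1000, seed=0):
    """Sample points closer than delta to x0; report whether every image
    lies closer than eps to F(x0)."""
    rng = random.Random(seed)
    delta = min(norm(x0), 0.5 * eps * norm(x0))
    for _ in range(trials):
        # Random offset whose length is strictly less than delta.
        d = [rng.uniform(-1.0, 1.0) for _ in x0]
        scale = rng.random() * 0.999 * delta / norm(d)
        x = [p + scale * q for p, q in zip(x0, d)]
        if dist(F(x), F(x0)) >= eps:
            return False
    return True

ok = check_bound([3.0, -4.0, 1.0], eps=0.01)
# Near the origin the jump has size 1, however close the point is to 0.
jump = dist(F([1e-9, 0.0]), F([0.0, 0.0]))
```

A sampling check of this kind cannot replace the proof; it only illustrates that the chosen $\delta$ behaves as the inequality predicts, and that no $\delta$ can work at the origin once $\varepsilon < 1$.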


Example 1.2.6. As a contrast to Example 1.2.5, consider the function 

\[ F(\vec{x}) = \begin{cases} \vec{x}, & \text{if all components of } \vec{x} \text{ are rational}, \\ \vec{0}, & \text{otherwise}. \end{cases} \]

The function $F$ is obviously continuous at $\vec{0}$, with $\delta = \varepsilon$ satisfying the requirements 
of Definition 1.2.3. On the other hand, if $\vec{x}_0 \neq \vec{0}$, then in $B_\delta(\vec{x}_0)$, for any $\delta > 0$, 
one can always find an $\vec{x}$ that has all rational components and another $\vec{x}$ that has at least one 
irrational component. Thus, if $\varepsilon < \|\vec{x}_0\|$, for all $\delta > 0$, we have 

\[ F(B_\delta(\vec{x}_0)) \not\subset B_\varepsilon(F(\vec{x}_0)). \]

Thus, $F$ is discontinuous everywhere except at $\vec{0}$. 


The following theorem implies many other corollaries concerning continuity of 
multivariable functions. 


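Theorem 1.2.7. Let $U$ be an open subset of $\mathbb{R}^n$, let $F : U \to \mathbb{R}^m$ be a function, and 
let $F_i$, with $i = 1, \ldots, m$, be the component functions. The function $F$ is continuous 
at the point $\vec{a} \in U$ if and only if, for all $i = 1, \ldots, m$, the component function 
$F_i : U \to \mathbb{R}$ is continuous at $\vec{a}$. 


Proof. Suppose that $F$ is continuous at $\vec{a}$. Thus, for all $\varepsilon > 0$, there exists a $\delta > 0$ 
such that $\|\vec{x} - \vec{a}\| < \delta$ implies $\|F(\vec{x}) - F(\vec{a})\| < \varepsilon$. Since 

\[ \|F(\vec{x}) - F(\vec{a})\| = \sqrt{(F_1(\vec{x}) - F_1(\vec{a}))^2 + \cdots + (F_m(\vec{x}) - F_m(\vec{a}))^2}, \]

then for all $\varepsilon > 0$, having $\|\vec{x} - \vec{a}\| < \delta$ implies that 

\[ |F_i(\vec{x}) - F_i(\vec{a})| \le \|F(\vec{x}) - F(\vec{a})\| < \varepsilon \]

for any $i$. Hence, each function $F_i : U \to \mathbb{R}$ is continuous at $\vec{a}$. 
Conversely, suppose that all the functions $F_i$ are continuous at $\vec{a}$. Thus, for any $\varepsilon > 0$ 
and for all $i$, there exists $\delta_i > 0$ such that $\|\vec{x} - \vec{a}\| < \delta_i$ implies $|F_i(\vec{x}) - F_i(\vec{a})| < \varepsilon/\sqrt{m}$. 
Then taking $\delta = \min(\delta_1, \ldots, \delta_m)$, if $\|\vec{x} - \vec{a}\| < \delta$, then 

\[ \|F(\vec{x}) - F(\vec{a})\| = \sqrt{|F_1(\vec{x}) - F_1(\vec{a})|^2 + \cdots + |F_m(\vec{x}) - F_m(\vec{a})|^2} < \sqrt{\frac{\varepsilon^2}{m} + \cdots + \frac{\varepsilon^2}{m}} = \varepsilon. \]

Thus $F$ is continuous at $\vec{a}$. 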


If $U$ is an open set containing a point $\vec{a}$, then the set $U - \{\vec{a}\}$ is called a 
deleted neighborhood of $\vec{a}$. If $F$ is a function into $\mathbb{R}^m$ defined on a deleted 
neighborhood of a point $\vec{a} \in \mathbb{R}^n$, it is possible to define the limit of $F$ at $\vec{a}$. The limit 
of $F$ at $\vec{a}$ is the value $\vec{L}$ such that if $F(\vec{a})$ were $\vec{L}$, then $F$ would be continuous 
at $\vec{a}$. We make this more precise as follows. 


Definition 1.2.8. Let $\vec{a} \in \mathbb{R}^n$. Let $F$ be a function from an open subset $U - \{\vec{a}\} \subset 
\mathbb{R}^n$ into $\mathbb{R}^m$. The limit of $F$ at $\vec{a}$ is defined as the point $\vec{L}$, and we write 

\[ \lim_{\vec{x} \to \vec{a}} F(\vec{x}) = \vec{L}, \]

if for all $\varepsilon > 0$ there exists a $\delta > 0$ such that 

\[ F(B_\delta(\vec{a}) - \{\vec{a}\}) \subset B_\varepsilon(\vec{L}). \]


We point out right away that a function $F : U \to \mathbb{R}^m$, where $U$ is open in $\mathbb{R}^n$, is 
continuous at $\vec{a} \in U$ if and only if 

\[ \lim_{\vec{x} \to \vec{a}} F(\vec{x}) = F(\vec{a}). \]

Key results in calculus and analysis are the limit laws along with their implications 
for continuity. 


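Theorem 1.2.9. Let $U$ be an open set in $\mathbb{R}^n$, let $\vec{a} \in U$, and let $F$ and $G$ be 
functions from $U - \{\vec{a}\}$ to $\mathbb{R}^m$ and $w : U - \{\vec{a}\} \to \mathbb{R}$. Suppose that the limits of $F$, 
$G$, and $w$ at $\vec{a}$ exist. Then 

\[ \lim_{\vec{x} \to \vec{a}} (F(\vec{x}) + G(\vec{x})) = \lim_{\vec{x} \to \vec{a}} F(\vec{x}) + \lim_{\vec{x} \to \vec{a}} G(\vec{x}), \]
\[ \lim_{\vec{x} \to \vec{a}} (w(\vec{x}) F(\vec{x})) = \Bigl( \lim_{\vec{x} \to \vec{a}} w(\vec{x}) \Bigr) \Bigl( \lim_{\vec{x} \to \vec{a}} F(\vec{x}) \Bigr), \]
\[ \lim_{\vec{x} \to \vec{a}} (F(\vec{x}) \cdot G(\vec{x})) = \Bigl( \lim_{\vec{x} \to \vec{a}} F(\vec{x}) \Bigr) \cdot \Bigl( \lim_{\vec{x} \to \vec{a}} G(\vec{x}) \Bigr) \quad \text{(dot product)}, \]
\[ \lim_{\vec{x} \to \vec{a}} \|F(\vec{x})\| = \Bigl\| \lim_{\vec{x} \to \vec{a}} F(\vec{x}) \Bigr\|. \]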


Figure 1.4: Example 1.2.12. 


Proof. (Left as an exercise for the reader.) 


Theorem 1.2.10. Let $U$ be an open set in $\mathbb{R}^n$, let $F$ and $G$ be functions from $U$ 
to $\mathbb{R}^m$, let $w : U \to \mathbb{R}$, and suppose that $F$, $G$, and $w$ are all continuous at $\vec{a} \in U$. 
Then the functions $\|F\|$, $F + G$, $wF$, and $F \cdot G$ are also continuous at $\vec{a}$. If $m = 3$, 
then the vector function $F \times G$ is also continuous at $\vec{a}$. 


Proof. (Left as an exercise for the reader.) 


Similar to most multivariable calculus courses, before addressing partial deriva- 
tives, we introduce the notion of a directional derivative, which measures the rate 
of change of a function in a given direction. 


Definition 1.2.11. Let $F$ be a function from an open subset $U \subset \mathbb{R}^n$ into $\mathbb{R}^m$, let 
$\vec{x}_0 \in U$ be a point, and let $\vec{u}$ be a unit vector. The directional derivative of $F$ in 
the direction $\vec{u}$ at the point $\vec{x}_0$ is 

\[ D_{\vec{u}} F(\vec{x}_0) = \lim_{h \to 0} \frac{F(\vec{x}_0 + h\vec{u}) - F(\vec{x}_0)}{h} \]

whenever the limit exists. 


Another way to understand this definition is to consider the curve $\vec{\gamma} : (-\varepsilon, \varepsilon) \to 
\mathbb{R}^m$, for some $\varepsilon > 0$, defined by $\vec{\gamma}(t) = F(\vec{x}_0 + t\vec{u})$. Then $D_{\vec{u}} F(\vec{x}_0)$ is equal to the 
derivative $\vec{\gamma}'(0)$. 

We note that though $F$ is a multivariable function, the definition of $D_{\vec{u}} F(\vec{x}_0)$ 
reduces to a single variable, vector-valued function before taking a derivative. 
Example 1.2.12. Consider the function $F(s,t) = (s^2 - t^2, 2st)$ from $\mathbb{R}^2$ to itself. 
We will calculate the directional derivative of $F$ at $\vec{x}_0 = (1,2)$ in the direction of 
$\vec{u} = (1/2, -\sqrt{3}/2)$. 

We can picture this kind of function by plotting a discrete set of coordinate 
lines mapped under $F$ (see Figure 1.4). However, for functions like $F$ that are not 
injective, even this method of picturing $F$ can be misleading since every point in 
the codomain can have multiple preimages. 

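Now, 

\[ F(\vec{x}_0 + t\vec{u}) = \Bigl( \bigl(1 + \tfrac{1}{2}t\bigr)^2 - \bigl(2 - \tfrac{\sqrt{3}}{2}t\bigr)^2,\; 2\bigl(1 + \tfrac{1}{2}t\bigr)\bigl(2 - \tfrac{\sqrt{3}}{2}t\bigr) \Bigr) \]
\[ = \Bigl( -3 + (1 + 2\sqrt{3})t - \tfrac{1}{2}t^2,\; 4 + (2 - \sqrt{3})t - \tfrac{\sqrt{3}}{2}t^2 \Bigr), \]

so 

\[ D_{\vec{u}} F(\vec{x}_0) = \Bigl( (1 + 2\sqrt{3}) - t,\; (2 - \sqrt{3}) - \sqrt{3}\,t \Bigr)\Big|_{t=0} = (1 + 2\sqrt{3},\, 2 - \sqrt{3}). \]

Figure 1.4 shows the curve $F(\vec{x}_0 + t\vec{u})$ and illustrates the directional derivative 
as being the derivative of $F(\vec{x}_0 + t\vec{u})$ at $t = 0$. The figure shows that though $\vec{u}$ must 
be a unit vector, the directional derivative is usually not. 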
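As a sanity check on Example 1.2.12, the directional derivative can be approximated by a difference quotient along the line $\vec{x}_0 + t\vec{u}$. The sketch below is illustrative only; the function name `directional_derivative` and the step size `h` are assumptions, not notation from the text. For the unit vector $\vec{u} = (1/2, -\sqrt{3}/2)$ the Jacobian of $F$ at $(1,2)$ applied to $\vec{u}$ gives $(1 + 2\sqrt{3},\, 2 - \sqrt{3})$; differentiating along the non-normalized vector $2\vec{u} = (1, -\sqrt{3})$ instead would give twice that value, $(2 + 4\sqrt{3},\, 4 - 2\sqrt{3})$.

```python
import math

def F(s, t):
    return (s * s - t * t, 2.0 * s * t)

def directional_derivative(f, x0, u, h=1e-6):
    """Central-difference approximation of d/dt f(x0 + t u) at t = 0."""
    fp = f(x0[0] + h * u[0], x0[1] + h * u[1])
    fm = f(x0[0] - h * u[0], x0[1] - h * u[1])
    return tuple((p - m) / (2.0 * h) for p, m in zip(fp, fm))

x0 = (1.0, 2.0)
u = (0.5, -math.sqrt(3.0) / 2.0)           # unit vector
approx = directional_derivative(F, x0, u)
exact = (1.0 + 2.0 * math.sqrt(3.0), 2.0 - math.sqrt(3.0))

# Using the non-normalized vector 2u doubles the result.
approx_2u = directional_derivative(F, x0, (1.0, -math.sqrt(3.0)))
```

Because the result scales linearly with the length of the direction vector, normalizing the direction before differentiating matters for the numerical value.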


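Let $F$ be a function from an open set $U \subset \mathbb{R}^n$ to $\mathbb{R}^m$. For any point $\vec{x}_0 \in U$, 
the directional derivative of $F$ in the direction $\vec{u}_k$ at $\vec{x}_0$, where $\vec{u}_k$ denotes the $k$th 
standard basis vector, is called the $k$th partial derivative of $F$ at $\vec{x}_0$. The $k$th partial 
derivative of $F$ is itself a vector function possibly defined on a smaller set than $U$. Writing 

\[ F(\vec{x}) = (F_1(x_1, \ldots, x_n), \ldots, F_m(x_1, \ldots, x_n)), \]

some common notations for the $k$th partial derivative $D_{\vec{u}_k} F$ are 

\[ F_{x_k}, \quad \frac{\partial F}{\partial x_k}, \quad D_k F, \quad F_{,k}. \]

In the last notation, the comma distinguishes the derivative operation from an 
index. It is not hard to show that 

\[ \frac{\partial F}{\partial x_k}(x_1, \ldots, x_n) = \left( \frac{\partial F_1}{\partial x_k}(x_1, \ldots, x_n), \ldots, \frac{\partial F_m}{\partial x_k}(x_1, \ldots, x_n) \right). \]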


Example 1.2.13. Consider the real-valued function $f(x_1, x_2)$ defined by 

\[ f(x_1, x_2) = \begin{cases} \dfrac{x_1 x_2^2}{x_1^2 + x_2^4}, & \text{if } (x_1, x_2) \neq (0, 0), \\ 0, & \text{otherwise}. \end{cases} \]

See Figure 1.5. We study the behavior of $f$ near $\vec{0}$. 

Figure 1.5: Graph of the function in Example 1.2.13. 

Let $\vec{u} = (u_1, u_2)$ be a unit vector, with $u_1 \neq 0$. Then 

\[ D_{\vec{u}} f(\vec{0}) = \lim_{h \to 0} \frac{f(\vec{0} + h\vec{u}) - f(\vec{0})}{h} = \lim_{h \to 0} \frac{h^3 u_1 u_2^2}{h(h^2 u_1^2 + h^4 u_2^4)} = \lim_{h \to 0} \frac{u_1 u_2^2}{u_1^2 + h^2 u_2^4} = \frac{u_2^2}{u_1}. \]

If $u_1 = 0$, then $f(\vec{0} + h\vec{u}) = 0$ for all $h$, so $D_{\vec{u}} f(\vec{0}) = 0$. Thus, the directional 
derivative $D_{\vec{u}} f(\vec{0})$ is defined for all unit vectors $\vec{u}$. 
On the other hand, consider the curve $\vec{\gamma}(t) = (t^2, t)$. Along this curve, if $t \neq 0$, 

\[ f(\vec{\gamma}(t)) = \frac{t^4}{t^4 + t^4} = \frac{1}{2}. \]

Thus, 

\[ f(\vec{\gamma}(t)) = \begin{cases} 1/2, & \text{if } t \neq 0, \\ 0, & \text{if } t = 0, \end{cases} \]

which is not continuous. Notice that this implies that $f$ as a function from $\mathbb{R}^2$ to 
$\mathbb{R}$ is not continuous at $\vec{0}$ since, taking $\varepsilon = \frac{1}{4}$, for all $\delta > 0$, there exist points $\vec{x}$ (in 
this case, points of the form $\vec{x} = (t^2, t)$) such that $\|\vec{x}\| < \delta$ and $|f(\vec{x})| > \varepsilon$. 

Therefore, the function $f$ is defined at $\vec{0}$, has directional derivatives in every 
direction at $\vec{0}$, but is not continuous at $\vec{0}$. 
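The two behaviors in Example 1.2.13 can be observed numerically. The following sketch is an illustration under stated assumptions (the angle `theta`, the step sizes, and the helper name `difference_quotient` are arbitrary choices, not from the text). It evaluates the difference quotient along a line through the origin and the values of $f$ along the parabola $(t^2, t)$.

```python
import math

def f(x1, x2):
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 * x2 ** 2 / (x1 ** 2 + x2 ** 4)

def difference_quotient(u, h):
    """(f(0 + h u) - f(0)) / h for a direction u through the origin."""
    return (f(h * u[0], h * u[1]) - f(0.0, 0.0)) / h

theta = 1.0                                   # arbitrary direction with u1 != 0
u = (math.cos(theta), math.sin(theta))
quotient = difference_quotient(u, 1e-5)       # approaches u2^2 / u1
limit = u[1] ** 2 / u[0]

# Along the curve (t^2, t) the function stays at 1/2 as t -> 0,
# even though f(0, 0) = 0.
along_parabola = [f(t * t, t) for t in (1e-1, 1e-3, 1e-6)]
```

Straight lines through the origin only see the limit $u_2^2/u_1$, while the parabola reveals the jump of size $1/2$ at the origin.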


Example 1.2.13 shows that it is possible for a vector function to have directional 
derivatives in every direction at some point $\vec{a}$ but, at the same time, fail to be 
continuous at $\vec{a}$. The reason for this is that the directional derivative depends only 
upon the behavior of a function along a line through $\vec{a}$, while approaching $\vec{a}$ along 
other families of curves may exhibit a different behavior of the function. 
Example 1.2.13 also illustrates that even if all the partial derivatives of a function 
$F$ exist at a point, we should not yet consider it as differentiable there. A better 
approach is to call a function differentiable at some point if it can be approximated 
by a linear function. 


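Definition 1.2.14. Let $F$ be a function from an open set $U \subset \mathbb{R}^n$ to $\mathbb{R}^m$ and 
let $\vec{a} \in U$. We call $F$ differentiable at $\vec{a}$ if there exist a linear transformation 
$L : \mathbb{R}^n \to \mathbb{R}^m$ and a function $R$ defined in a neighborhood $V$ of $\vec{0}$ such that for all 
$\vec{h} \in V$, 

\[ F(\vec{a} + \vec{h}) = F(\vec{a}) + L(\vec{h}) + R(\vec{h}), \]

with 

\[ \lim_{\vec{h} \to \vec{0}} \frac{R(\vec{h})}{\|\vec{h}\|} = \vec{0}. \]

If $F$ is differentiable at $\vec{a}$, the linear transformation $L$ is denoted by $dF_{\vec{a}}$ and is 
called the differential of $F$ at $\vec{a}$. 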


Notations for the differential vary widely. Though we will consistently use $dF_{\vec{a}}$ 
for the differential of $F$ at $\vec{a}$, some authors write $dF(\vec{a})$ instead. The notation in 
this text attempts to use the most common notation in differential geometry texts 
and to incorporate some notation that is standard among modern linear algebra 
texts. 

If bases $\mathcal{B}$ and $\mathcal{B}'$ are given for $\mathbb{R}^n$ and $\mathbb{R}^m$, then we denote the matrix for $dF_{\vec{a}}$ 
by 

\[ [dF_{\vec{a}}]_{\mathcal{B}}^{\mathcal{B}'}. \]

Assuming we use the standard bases for $\mathbb{R}^n$ and $\mathbb{R}^m$, we write the matrix for $dF_{\vec{a}}$ 
as $[dF_{\vec{a}}]$. 

If $F$ is differentiable over an open set $U \subset \mathbb{R}^n$, the differential $dF$ (not evaluated 
at any point) is a function from $U$ to $\mathrm{Hom}(\mathbb{R}^n, \mathbb{R}^m)$, the set of linear transformations 
from $\mathbb{R}^n$ to $\mathbb{R}^m$. Its associated matrix $[dF]$ is a matrix of functions, each defined 
over $U$, and we call $[dF]$ the Jacobian matrix of $F$. 

If $m = n$, the determinant of the Jacobian matrix is simply called the Jacobian 
of $F$. The Jacobian of $F$ is a function $U \to \mathbb{R}$, and some common notations include 

\[ J(F), \quad \frac{\partial(F_1, \ldots, F_n)}{\partial(x_1, \ldots, x_n)}, \quad \det\left( \frac{\partial F_i}{\partial x_j} \right), \quad \text{and} \quad \det(dF). \]

Differentiability at a point is a strong condition that implies both continuity 
and the existence of directional derivatives. In the propositions in the rest of the 
section, $F$ is a function from an open set $U \subset \mathbb{R}^n$ to $\mathbb{R}^m$ and $\vec{a}$ is any point in $U$. 


Proposition 1.2.15. If $F$ is differentiable at $\vec{a}$, then $F$ is continuous at $\vec{a}$. 


Proof. Suppose we have the condition of Definition 1.2.14. From Theorem 1.2.7, 
since each component function of a linear transformation is a polynomial in the input 
variables, we deduce that the linear transformation $L$ is continuous everywhere. 
Hence, 

\[ \lim_{\vec{h} \to \vec{0}} L(\vec{h}) = L(\vec{0}) = \vec{0}. \]

The condition on $R$ also implies that $\lim_{\vec{h} \to \vec{0}} R(\vec{h}) = \vec{0}$. Hence, 

\[ \lim_{\vec{x} \to \vec{a}} F(\vec{x}) = \lim_{\vec{h} \to \vec{0}} F(\vec{a} + \vec{h}) = F(\vec{a}) + \lim_{\vec{h} \to \vec{0}} (L(\vec{h}) + R(\vec{h})) = F(\vec{a}). \]

Hence, $F$ is continuous at $\vec{a}$. 


Proposition 1.2.16. If $F$ is differentiable at $\vec{a}$, then it has a directional derivative 
in every direction at $\vec{a}$. Furthermore, $D_{\vec{u}} F(\vec{a}) = dF_{\vec{a}}(\vec{u})$. 


Proof. (Left as an exercise for the reader.) 


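Since the differential $dF_{\vec{a}}$ is a linear function from $\mathbb{R}^n$ to $\mathbb{R}^m$, for a vector $\vec{v} = 
(v_1, v_2, \ldots, v_n)$ with coordinates given with respect to the standard basis, at any 
point $\vec{a}$ we have 

\[ dF_{\vec{a}}(v_1 \vec{u}_1 + \cdots + v_n \vec{u}_n) = v_1\, dF_{\vec{a}}(\vec{u}_1) + \cdots + v_n\, dF_{\vec{a}}(\vec{u}_n) \]
\[ = v_1 \frac{\partial F}{\partial x_1}\Big|_{\vec{a}} + \cdots + v_n \frac{\partial F}{\partial x_n}\Big|_{\vec{a}}, \]

where the second line follows from the last part of Proposition 1.2.16. Finally, 
viewing each partial derivative $\frac{\partial F}{\partial x_i}(\vec{a})$ as a column vector, we have 

\[ dF_{\vec{a}}(\vec{v}) = \begin{pmatrix} \dfrac{\partial F}{\partial x_1}(\vec{a}) & \dfrac{\partial F}{\partial x_2}(\vec{a}) & \cdots & \dfrac{\partial F}{\partial x_n}(\vec{a}) \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}. \]

This proves the following proposition. 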


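Proposition 1.2.17. Writing $F = (F_1, F_2, \ldots, F_m)$ in component functions, at 
any point where $F$ is differentiable, the Jacobian matrix of $F$ is 

\[ [dF] = \begin{pmatrix} \dfrac{\partial F}{\partial x_1} & \dfrac{\partial F}{\partial x_2} & \cdots & \dfrac{\partial F}{\partial x_n} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial F_1}{\partial x_1} & \dfrac{\partial F_1}{\partial x_2} & \cdots & \dfrac{\partial F_1}{\partial x_n} \\ \dfrac{\partial F_2}{\partial x_1} & \dfrac{\partial F_2}{\partial x_2} & \cdots & \dfrac{\partial F_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial F_m}{\partial x_1} & \dfrac{\partial F_m}{\partial x_2} & \cdots & \dfrac{\partial F_m}{\partial x_n} \end{pmatrix}, \tag{1.2} \]

where in the middle expression we view $\partial F/\partial x_i$ as a column vector. 


Example 1.2.13 shows that the implication statement in Proposition 1.2.16 cannot 
be replaced with an equivalence statement. Therefore, one should remember 
the caveat that the Jacobian matrix may exist at a point $\vec{a}$ without $F$ being 
differentiable at $\vec{a}$, but in that case, $dF_{\vec{a}}$ does not even exist. 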
Example 1.2.18. Consider a function $f$ from an open set $U \subset \mathbb{R}^n$ to $\mathbb{R}$. The 
differential $df$ has the matrix 

\[ [df] = \begin{pmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \cdots & \dfrac{\partial f}{\partial x_n} \end{pmatrix}, \]

which is in fact the gradient of $f$, though viewed as a row vector. 


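Example 1.2.19. As a simple example of calculating the Jacobian matrix, consider 
the function 

\[ F(x_1, x_2) = (3x_1 + x_2^2,\; x_1 \cos x_2,\; e^{x_1 - 2x_2} + 2x_2). \]

It is defined over all of $\mathbb{R}^2$. The Jacobian matrix is 

\[ [dF] = \begin{pmatrix} \dfrac{\partial F_1}{\partial x_1} & \dfrac{\partial F_1}{\partial x_2} \\ \dfrac{\partial F_2}{\partial x_1} & \dfrac{\partial F_2}{\partial x_2} \\ \dfrac{\partial F_3}{\partial x_1} & \dfrac{\partial F_3}{\partial x_2} \end{pmatrix} = \begin{pmatrix} 3 & 2x_2 \\ \cos x_2 & -x_1 \sin x_2 \\ e^{x_1 - 2x_2} & -2e^{x_1 - 2x_2} + 2 \end{pmatrix}. \]

If, for example, $\vec{a} = (2, \pi/2)$, then the matrix for $dF_{\vec{a}}$ is 

\[ [dF_{\vec{a}}] = \begin{pmatrix} 3 & \pi \\ 0 & -2 \\ e^{2-\pi} & -2e^{2-\pi} + 2 \end{pmatrix}. \]

If, in addition, $\vec{v} = (3, -4)$ with coordinates given in the standard basis, then 

\[ dF_{\vec{a}}(\vec{v}) = \begin{pmatrix} 3 & \pi \\ 0 & -2 \\ e^{2-\pi} & -2e^{2-\pi} + 2 \end{pmatrix} \begin{pmatrix} 3 \\ -4 \end{pmatrix} = \begin{pmatrix} 9 - 4\pi \\ 8 \\ 11e^{2-\pi} - 8 \end{pmatrix}. \]

To calculate the directional derivative in the direction of $\vec{v}$, we must use the unit 
vector $\vec{u} = \vec{v}/\|\vec{v}\| = (0.6, -0.8)$ and 

\[ D_{\vec{u}} F(\vec{a}) = dF_{\vec{a}}(\vec{u}) = \begin{pmatrix} 1.8 - 0.8\pi \\ 1.6 \\ 2.2e^{2-\pi} - 1.6 \end{pmatrix}. \]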
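The Jacobian matrix in Example 1.2.19 can be cross-checked against finite differences. The code below is a minimal sketch in plain Python; the helper names `jacobian_fd` and `mat_vec` and the step size are assumptions made for illustration. It compares the hand-computed matrix with a central-difference approximation and evaluates $dF_{\vec{a}}$ on $\vec{v} = (3,-4)$ and on the unit vector $\vec{u} = (0.6, -0.8)$.

```python
import math

def F(x1, x2):
    return (3.0 * x1 + x2 ** 2,
            x1 * math.cos(x2),
            math.exp(x1 - 2.0 * x2) + 2.0 * x2)

def jacobian_exact(x1, x2):
    """Jacobian matrix of F computed by hand (rows = components of F)."""
    e = math.exp(x1 - 2.0 * x2)
    return [[3.0, 2.0 * x2],
            [math.cos(x2), -x1 * math.sin(x2)],
            [e, -2.0 * e + 2.0]]

def jacobian_fd(f, x, h=1e-6):
    """Column j is a central-difference approximation of dF/dx_j."""
    m = len(f(*x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(*xp), f(*xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J

def mat_vec(J, v):
    return [sum(a * b for a, b in zip(row, v)) for row in J]

a = (2.0, math.pi / 2.0)
J = jacobian_exact(*a)
J_fd = jacobian_fd(F, a)
dF_v = mat_vec(J, (3.0, -4.0))    # dF_a(v) for v = (3, -4)
D_u = mat_vec(J, (0.6, -0.8))     # directional derivative along u = v / ||v||
```

Since $dF_{\vec{a}}$ is linear, `D_u` is simply `dF_v` divided by $\|\vec{v}\| = 5$.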


PROBLEMS 


1.2.1. Let F(x,y) = (3a —2y+4zy, x* — 3x°y? +32y+1). Determine the domain of the 
function, explain why or why not the function is continuous over its domain, and 
find all its (first) partial derivatives. 


1.2.2. Repeat Problem 1 with F(2,y) = (- — vuln oe 


1.2.3. Repeat Problem 1 with F (x,y,z) = (tan(a/y), 2°e¥***, /x? + y? + 2?). 
1.2.4. Let F(z, y,z) = (cos(4a + 3yz),2z/(l+27+y7)). Calculate Fra, Fyz and Fryz. 


1.2.5. Let $F : \mathbb{R}^n \to \mathbb{R}$ be a function defined by $F(\vec{x}) = e^{\vec{a} \cdot \vec{x}}$, where $\vec{a}$ is a unit vector in 
$\mathbb{R}^n$. Prove that 

\[ \frac{\partial^2 F}{\partial x_1^2} + \frac{\partial^2 F}{\partial x_2^2} + \cdots + \frac{\partial^2 F}{\partial x_n^2} = F. \]


1.2.6. If $F$ is a linear function, show that $F$ is continuous. 


1.2.7. Show that the following function is continuous everywhere 


Fads ie sin (4) + x2 sin (2), if riv2 £0, 


0, if e122 = 0. 


1.2.8. Find the directional derivative of $F(s,t) = (s^3 - 3st^2, 3s^2t - t^3)$ at $(2,3)$ in the 
direction $\vec{u} = (1/\sqrt{2}, 1/\sqrt{2})$. 


1.2.9. Find the directional derivative of $F(x_1, x_2, x_3) = (x_1 + x_2 + x_3,\; x_1x_2 + x_2x_3 + 
x_1x_3,\; x_1x_2x_3)$ at $(1,2,3)$ in the direction of $\vec{u} = (1/\sqrt{2}, 1/\sqrt{3}, 1/\sqrt{6})$. 


1.2.10. Let $F : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $F(u,v) = (u^2 - v^2, 2uv)$. Calculate the Jacobian 
matrix of $F$. Find all points in $\mathbb{R}^2$ where $J(F) = 0$. 


1.2.11. Define $F$ over $\mathbb{R}^2$ by $F(x,y) = (e^x \cos y, e^x \sin y)$. Calculate the partial derivatives 
$F_x$ and $F_y$. Show that the Jacobian $J(F)$ is never 0. Conclude that $F_x$ and $F_y$ 
are never collinear. 


1.2.12. Let $F(u,v) = (\cos u \sin v, \sin u \sin v, \cos v)$ be defined over $[0, 2\pi] \times [0, \pi]$. Show 
that the image of $F$ lies on the unit sphere in $\mathbb{R}^3$. Calculate $dF_{(u,v)}$ for all $(u,v)$ 
in the domain. 


1.2.13. Define $F : \mathbb{R}^3 \to \mathbb{R}^3$ by 
F(u,v,w) = ((u? + uv) cos w, (u® + wv) sin w, u’). 


Calculate the partial derivatives $F_u$, $F_v$, and $F_w$. Calculate the Jacobian $J(F)$. 
Determine where $F$ does not have maximal rank. 


1.2.14. Define $F$ over the open set $\{(x,y,z) \in \mathbb{R}^3 \mid x > 0, y > 0, z > 0\}$ by F(2,y,z) = 
(x-y*,y-2",z-x"). Calculate the partial derivatives $F_x$, $F_y$, and $F_z$. Calculate 
the Jacobian $J(F)$. 


1.2.15. Let $F : \mathbb{R}^n \to \mathbb{R}^m$ be a linear function, with $F(\vec{v}) = A\vec{v}$ for some $m \times n$ matrix 
$A$. Prove that the Jacobian matrix is the constant matrix $A$ and that for all $\vec{a}$, 
$dF_{\vec{a}} = F$. 


1.2.16. Let $F(u,v) = (u \cos v, u \sin v, u)$ defined over $\mathbb{R}^2$. Show that the image of $F$ is 
a cone $x^2 + y^2 - z^2 = 0$. Calculate the differential, and determine where the 
differential does not have maximal rank. 

1.2.17. Prove the limit laws listed in Theorem 1.2.9. 

1.2.18. Prove Theorem 1.2.10. 


1.2.19. Prove Proposition 1.2.16. [Hint: Using Definition 1.2.14, set $\vec{h} = h\vec{u}$, where $\vec{u}$ is a 
unit vector.] 


1.2.20. Prove that if a function $F$ is differentiable at $\vec{a}$, then $F$ is continuous at $\vec{a}$. 


1.2.21. Mean Value Theorem. Let $F$ be a real-valued function defined over an open set 
$U \subset \mathbb{R}^n$ and differentiable at every point of $U$. If the segment $[\vec{a}, \vec{b}] \subset U$, then 
there exists a point $\vec{c}$ in the segment $[\vec{a}, \vec{b}]$ such that 

\[ F(\vec{b}) - F(\vec{a}) = dF_{\vec{c}}(\vec{b} - \vec{a}). \]


1.2.22. (*) Let $n \le m$, and consider a function $F : U \to \mathbb{R}^m$ of class $C^1$, where $U$ is an 
open set in $\mathbb{R}^n$. Let $p \in U$, and suppose that $dF_p$ is injective. 

(a) Prove there exists a positive number $A_p$ such that $\|dF_p(\vec{v})\| \ge A_p \|\vec{v}\|$ for all 
$\vec{v} \in \mathbb{R}^n$. 
(b) Use part (a) and the Mean Value Theorem to show that $F$ is locally injective 
near $p$, i.e., there exists an open neighborhood $U'$ of $p$ such that $F : U' \to 
F(U')$ is injective. 


1.3 Differentiation Rules; Functions of Class $C^r$ 


In a single-variable calculus course, one learns a number of differentiation rules. 
With functions $F$ from $\mathbb{R}^n$ to $\mathbb{R}^m$, one must use some caution since the matrix $[dF]$ 
of the differential $dF$ is not a vector function but a matrix of functions. (Again, we 
remind the reader that our notation for evaluating the matrix of functions $[dF]$ at 
a point $\vec{a}$ is $[dF_{\vec{a}}]$.) 


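Theorem 1.3.1. Let $U$ be an open set in $\mathbb{R}^n$. Let $F$ and $G$ be functions from $U$ 
to $\mathbb{R}^m$, and let $w : U \to \mathbb{R}$ be a scalar function. If $F$, $G$, and $w$ are differentiable 
at $\vec{a}$, then $F + G$ and $wF$ are differentiable at $\vec{a}$ and 

1. $d(F + G)_{\vec{a}} = dF_{\vec{a}} + dG_{\vec{a}}$; 

2. $[d(wF)_{\vec{a}}] = w(\vec{a})\,[dF_{\vec{a}}] + [F(\vec{a})]\,[dw_{\vec{a}}]$. 

Proof. The proof for both parts follows from Proposition 1.2.17. Explicitly for the 
second part, the $ij$-entry of $[d(wF)_{\vec{a}}]$ is 

\[ \frac{\partial (wF_i)}{\partial x_j}\Big|_{\vec{a}} = w(\vec{a}) \frac{\partial F_i}{\partial x_j}\Big|_{\vec{a}} + F_i(\vec{a}) \frac{\partial w}{\partial x_j}\Big|_{\vec{a}}. \]

The first term on the right side is the $ij$-entry of $w(\vec{a})[dF_{\vec{a}}]$ while the second term 
is the $ij$-entry of $[F(\vec{a})]\,[dw_{\vec{a}}]$, which is the product of a column vector by a row vector. 
The result follows. 


Note that in Theorem 1.3.1(2), $[F(\vec{a})]$ is a column vector of dimension $m$, while 
$[dw_{\vec{a}}]$ is a row vector of dimension $n$. Hence $[F(\vec{a})]\,[dw_{\vec{a}}]$ is an $m \times n$ matrix of 
rank at most 1. 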
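Example 1.3.2. Let $F(u,v) = (u^2 - v, v^3, u + 2v + 1)$, and let $w(u,v) = u^3 + uv - 2$. 
The differentials of $F$ and $w$ are 

\[ [dF] = \begin{pmatrix} 2u & -1 \\ 0 & 3v^2 \\ 1 & 2 \end{pmatrix} \quad \text{and} \quad [dw] = \begin{pmatrix} 3u^2 + v & u \end{pmatrix}. \]

According to Theorem 1.3.1, the Jacobian matrix of $wF$ is 

\[ [d(wF)] = w[dF] + [F]\,[dw] \]
\[ = (u^3 + uv - 2) \begin{pmatrix} 2u & -1 \\ 0 & 3v^2 \\ 1 & 2 \end{pmatrix} + \begin{pmatrix} u^2 - v \\ v^3 \\ u + 2v + 1 \end{pmatrix} \begin{pmatrix} 3u^2 + v & u \end{pmatrix} \]
\[ = \begin{pmatrix} 2u(u^3 + uv - 2) & -(u^3 + uv - 2) \\ 0 & 3v^2(u^3 + uv - 2) \\ (u^3 + uv - 2) & 2(u^3 + uv - 2) \end{pmatrix} + \begin{pmatrix} (u^2 - v)(3u^2 + v) & u(u^2 - v) \\ v^3(3u^2 + v) & uv^3 \\ (u + 2v + 1)(3u^2 + v) & u(u + 2v + 1) \end{pmatrix} \]
\[ = \begin{pmatrix} 5u^4 - v^2 - 4u & -2uv + 2 \\ 3u^2v^3 + v^4 & 3u^3v^2 + 4uv^3 - 6v^2 \\ 4u^3 + 6u^2v + 3u^2 + 2uv + 2v^2 + v - 2 & 2u^3 + u^2 + 4uv + u - 4 \end{pmatrix}. \]

If we had to find $[d(wF)_{(1,2)}]$, we could simplify the work and do 

\[ [d(wF)_{(1,2)}] = w(1,2)\,[dF_{(1,2)}] + [F(1,2)]\,[dw_{(1,2)}] \]
\[ = 1 \begin{pmatrix} 2 & -1 \\ 0 & 12 \\ 1 & 2 \end{pmatrix} + \begin{pmatrix} -1 \\ 8 \\ 6 \end{pmatrix} \begin{pmatrix} 5 & 1 \end{pmatrix} = \begin{pmatrix} 2 & -1 \\ 0 & 12 \\ 1 & 2 \end{pmatrix} + \begin{pmatrix} -5 & -1 \\ 40 & 8 \\ 30 & 6 \end{pmatrix} = \begin{pmatrix} -3 & -2 \\ 40 & 20 \\ 31 & 8 \end{pmatrix}. \]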

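We now consider the composition of two multivariable functions. Let $F$ be a 
function from a set $U \subset \mathbb{R}^n$ to $\mathbb{R}^m$, and let $G$ be a function from a set $V \subset \mathbb{R}^p$ to 
$\mathbb{R}^n$ such that $G(V) \subset U$. The composite function $F \circ G : V \to \mathbb{R}^m$, depicted by the 
diagram 

\[ V (\subset \mathbb{R}^p) \xrightarrow{\;G\;} U (\subset \mathbb{R}^n) \xrightarrow{\;F\;} \mathbb{R}^m, \]

is the function such that, for each $\vec{a} \in V$, 

\[ (F \circ G)(\vec{a}) = F(G(\vec{a})). \]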


As a consequence of a general theorem in topology (see Proposition A.1.28), we 
know that the composition of two continuous functions is continuous. The same 
is true for differentiable functions, and the chain rule tells us how to compute the 
differential of the composition of two functions. 
Theorem 1.3.3 (The Chain Rule). Let $F$ be a function from an open set $U \subset \mathbb{R}^n$ 
to $\mathbb{R}^m$, and let $G$ be a function from an open set $V \subset \mathbb{R}^p$ to $\mathbb{R}^n$ such that $G(V) \subset U$. 
Let $\vec{a} \in V$. If $G$ is differentiable at $\vec{a}$ and $F$ is differentiable at $G(\vec{a})$, then $F \circ G$ is 
differentiable at $\vec{a}$ and 

\[ d(F \circ G)_{\vec{a}} = dF_{G(\vec{a})} \circ dG_{\vec{a}}. \tag{1.3} \]

The Jacobian matrices satisfy the matrix product 

\[ [d(F \circ G)_{\vec{a}}] = [dF_{G(\vec{a})}]\,[dG_{\vec{a}}]. \tag{1.4} \]

Before proving this theorem, we establish a lemma. 


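Lemma 1.3.4. Let $A$ be an $m \times n$ matrix. For all $\vec{v} \in \mathbb{R}^n$, with $\|\vec{v}\| = 1$, the 
length $\|A\vec{v}\|$ is less than or equal to the square root of the largest eigenvalue $\lambda_1$ of 
$A^T A$. Furthermore, if $\vec{u}_1$ is a unit eigenvector of $A^T A$ corresponding to $\lambda_1$, then 
$\|A\vec{u}_1\| = \sqrt{\lambda_1}$. 

Proof. Assuming that we use standard bases in $\mathbb{R}^n$ and $\mathbb{R}^m$, then 

\[ \|A\vec{v}\|^2 = (A\vec{v}) \cdot (A\vec{v}) = (A\vec{v})^T (A\vec{v}) = \vec{v}^T A^T A \vec{v}. \]

By the Spectral Theorem from linear algebra, since $A^T A$ is a symmetric matrix, 
it is diagonalizable, has an orthonormal eigenbasis (i.e., a basis of eigenvectors of 
$A^T A$) $\{\vec{u}_1, \ldots, \vec{u}_n\}$, and all the eigenvalues are real. Assume $\vec{u}_i$ has eigenvalue $\lambda_i$. 
Note that 

\[ \lambda_i = \lambda_i\, \vec{u}_i^T \vec{u}_i = \vec{u}_i^T A^T A \vec{u}_i = \|A\vec{u}_i\|^2, \]

so $\lambda_i \ge 0$. We also suppose that the eigenvalues are labeled so that $\lambda_1 \ge \lambda_2 \ge \cdots \ge 
\lambda_n$. 
If $\vec{v}$ has unit length, we can write $\vec{v} = x_1 \vec{u}_1 + \cdots + x_n \vec{u}_n$, with $x_1^2 + \cdots + x_n^2 = 1$. 
Then 

\[ \|A\vec{v}\|^2 = \vec{v}^T A^T A \vec{v} = \lambda_1 x_1^2 + \cdots + \lambda_n x_n^2. \]

A simple calculation using Lagrange multipliers shows that $\|A\vec{v}\|^2$, subject to the 
constraint $\|\vec{v}\| = 1$, is maximized with value $\lambda_1$ when $(x_1, \ldots, x_n) = (1, 0, \ldots, 0)$. The 
lemma follows. 


We call $\sqrt{\lambda_1}$ in the above lemma the matrix norm of $A$ and denote it by $|A|$. 
Note that for all $\vec{v} \in \mathbb{R}^n$, $\|A\vec{v}\| \le |A|\,\|\vec{v}\|$. 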
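For a concrete feel for the matrix norm, the following sketch treats the $2 \times 2$ case, where the largest eigenvalue of $A^T A$ has a closed form. It is an illustration only: the sample matrix `A`, the helper names, and the sampling density are arbitrary assumptions. It compares $\sqrt{\lambda_1}$ with the largest value of $\|A\vec{v}\|$ over a fine sample of unit vectors.

```python
import math

def matrix_norm_2x2(A):
    """Square root of the largest eigenvalue of A^T A for a 2 x 2 matrix A."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    # Largest root of the characteristic polynomial of [[p, q], [q, r]].
    lam1 = 0.5 * (p + r) + math.sqrt(0.25 * (p - r) ** 2 + q * q)
    return math.sqrt(lam1)

def length_Av(A, v):
    return math.hypot(A[0][0] * v[0] + A[0][1] * v[1],
                      A[1][0] * v[0] + A[1][1] * v[1])

A = [[2.0, -1.0], [0.5, 3.0]]
norm_A = matrix_norm_2x2(A)

# Largest value of ||Av|| over 3600 equally spaced unit vectors v.
sampled_max = max(
    length_Av(A, (math.cos(2 * math.pi * k / 3600),
                  math.sin(2 * math.pi * k / 3600)))
    for k in range(3600)
)
```

The sampled maximum never exceeds `norm_A` and approaches it as the sample is refined, in line with the lemma.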


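Proof (of Theorem 1.3.3). Let $F$ and $G$ be functions as defined in the hypotheses of the 
theorem. Then there exist an $m \times n$ matrix $A$ and an $n \times p$ matrix $B$ such that 

\[ \begin{aligned} G(\vec{a} + \vec{h}) &= G(\vec{a}) + B\vec{h} + R_1(\vec{h}), \\ F(G(\vec{a}) + \vec{k}) &= F(G(\vec{a})) + A\vec{k} + R_2(\vec{k}), \end{aligned} \tag{1.5} \]

with 

\[ \lim_{\vec{h} \to \vec{0}} \frac{R_1(\vec{h})}{\|\vec{h}\|} = \vec{0} \quad \text{and} \quad \lim_{\vec{k} \to \vec{0}} \frac{R_2(\vec{k})}{\|\vec{k}\|} = \vec{0}. \tag{1.6} \]

Then for the composition $(F \circ G)(\vec{a} + \vec{h})$, we have 

\[ (F \circ G)(\vec{a} + \vec{h}) = F(G(\vec{a})) + AB\vec{h} + AR_1(\vec{h}) + R_2(B\vec{h} + R_1(\vec{h})). \tag{1.7} \]

Note that $\|AR_1(\vec{h})\| \le |A|\,\|R_1(\vec{h})\|$, so 

\[ 0 \le \lim_{\vec{h} \to \vec{0}} \frac{\|AR_1(\vec{h})\|}{\|\vec{h}\|} \le \lim_{\vec{h} \to \vec{0}} \frac{|A|\,\|R_1(\vec{h})\|}{\|\vec{h}\|}. \]

By the Squeeze Theorem, since $\lim_{\vec{h} \to \vec{0}} \|R_1(\vec{h})\|/\|\vec{h}\| = 0$, we deduce that 

\[ \lim_{\vec{h} \to \vec{0}} \frac{AR_1(\vec{h})}{\|\vec{h}\|} = \vec{0}. \]

Also because $\lim_{\vec{h} \to \vec{0}} \|R_1(\vec{h})\|/\|\vec{h}\| = 0$, for any $\varepsilon > 0$, there exists a $\delta > 0$ such 
that if $\vec{h} \in \mathbb{R}^p$, with $\|\vec{h}\| < \delta$, then $\|R_1(\vec{h})\| \le \varepsilon\|\vec{h}\|$. In particular, pick $\varepsilon = 1$ and 
let $\delta_0$ be the corresponding value of $\delta$. Then if $\|\vec{h}\| < \delta_0$, we have 

\[ \|B\vec{h} + R_1(\vec{h})\| \le \|B\vec{h}\| + \|R_1(\vec{h})\| \le (|B| + 1)\|\vec{h}\|. \]

This leads to 

\[ 0 \le \frac{\|R_2(B\vec{h} + R_1(\vec{h}))\|}{\|\vec{h}\|} \le (|B| + 1) \frac{\|R_2(B\vec{h} + R_1(\vec{h}))\|}{\|B\vec{h} + R_1(\vec{h})\|}. \tag{1.8} \]

However, by Equation (1.6), one concludes that 

\[ \lim_{\vec{h} \to \vec{0}} \frac{\|R_2(B\vec{h} + R_1(\vec{h}))\|}{\|B\vec{h} + R_1(\vec{h})\|} = 0, \]

and consequently, by Equation (1.8), 

\[ \lim_{\vec{h} \to \vec{0}} \frac{\|R_2(B\vec{h} + R_1(\vec{h}))\|}{\|\vec{h}\|} = 0. \]

Setting $R_3(\vec{h}) = AR_1(\vec{h}) + R_2(B\vec{h} + R_1(\vec{h}))$, we have $\lim_{\vec{h} \to \vec{0}} R_3(\vec{h})/\|\vec{h}\| = \vec{0}$, so from 
Equation (1.7), we see that all parts of the theorem hold. 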


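Example 1.3.5. Consider the functions $F(r,\theta) = (r\cos\theta, r\sin\theta)$ and $G(s,t) = 
(s^2 - t^2, 2st)$. Calculating the composition function directly, we have 

\[ (G \circ F)(r,\theta) = (r^2\cos^2\theta - r^2\sin^2\theta,\; 2r^2\cos\theta\sin\theta) = (r^2\cos 2\theta,\; r^2\sin 2\theta). \]

Thus, 

\[ [d(G \circ F)_{(r,\theta)}] = \begin{pmatrix} 2r\cos 2\theta & -2r^2\sin 2\theta \\ 2r\sin 2\theta & 2r^2\cos 2\theta \end{pmatrix}. \]

On the other hand, we have 

\[ [dG_{(s,t)}] = \begin{pmatrix} 2s & -2t \\ 2t & 2s \end{pmatrix} \quad \text{and} \quad [dF_{(r,\theta)}] = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}. \]

Using the right-hand side of the chain rule, we calculate 

\[ [dG_{F(r,\theta)}]\,[dF_{(r,\theta)}] = [dG_{(r\cos\theta,\, r\sin\theta)}]\,[dF_{(r,\theta)}] \]
\[ = \begin{pmatrix} 2r\cos\theta & -2r\sin\theta \\ 2r\sin\theta & 2r\cos\theta \end{pmatrix} \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} \]
\[ = \begin{pmatrix} 2r\cos 2\theta & -2r^2\sin 2\theta \\ 2r\sin 2\theta & 2r^2\cos 2\theta \end{pmatrix} = [d(G \circ F)_{(r,\theta)}], \]

as expected. 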
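The matrix identity (1.4) in Example 1.3.5 can be confirmed numerically at sample points. This is a minimal sketch; the helper names `matmul`, `chain_rule`, and `d_composite` and the chosen sample points are assumptions for illustration. It multiplies $[dG_{F(r,\theta)}]$ by $[dF_{(r,\theta)}]$ and compares the product with the Jacobian of the composite computed directly.

```python
import math

def dF(r, theta):
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def dG(s, t):
    return [[2 * s, -2 * t], [2 * t, 2 * s]]

def d_composite(r, theta):
    """Jacobian of (G o F)(r, theta) = (r^2 cos 2theta, r^2 sin 2theta)."""
    return [[2 * r * math.cos(2 * theta), -2 * r * r * math.sin(2 * theta)],
            [2 * r * math.sin(2 * theta), 2 * r * r * math.cos(2 * theta)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def chain_rule(r, theta):
    # dG must be evaluated at the image point F(r, theta), not at (r, theta).
    s, t = r * math.cos(theta), r * math.sin(theta)
    return matmul(dG(s, t), dF(r, theta))

def max_entry_gap(r, theta):
    P, Q = chain_rule(r, theta), d_composite(r, theta)
    return max(abs(P[i][j] - Q[i][j]) for i in range(2) for j in range(2))

gaps = [max_entry_gap(r, th) for r, th in [(1.0, 0.3), (2.5, -1.2), (0.7, 2.0)]]
```

The comment inside `chain_rule` marks the step most often mishandled in practice: the outer differential is evaluated at $F(r,\theta)$.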


The style of presentation of the chain rule in Theorem 1.3.3 is often attributed 
to Newton’s notation. Possible historical inaccuracies aside, Equation (1.3) is com- 
monly used by mathematicians. In contrast, physicists tend to use Leibniz’s nota- 
tion, which we present now. 


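Suppose that the vector variable $\vec{y} = (y_1, \ldots, y_n)$ is given as a function of a 
variable $\vec{x} = (x_1, \ldots, x_p)$ (this function corresponds to $G$ in Equation (1.3)) and 
suppose that the vector variable $\vec{z} = (z_1, \ldots, z_m)$ is given as a function of the 
variable $\vec{y}$ (this function corresponds to $F$). With Leibniz's notation, one writes the 
chain rule as 

\[ \frac{\partial z_i}{\partial x_j} = \sum_{k=1}^{n} \frac{\partial z_i}{\partial y_k} \frac{\partial y_k}{\partial x_j} \quad \text{for all } i = 1, \ldots, m \text{ and } j = 1, \ldots, p. \tag{1.9} \]

When evaluating $\partial z_i/\partial x_j$ at a point $\vec{a} \in \mathbb{R}^p$, one should understand the chain rule 
in Equation (1.9) explicitly as 

\[ \frac{\partial z_i}{\partial x_j}\Big|_{\vec{a}} = \sum_{k=1}^{n} \frac{\partial z_i}{\partial y_k}\Big|_{G(\vec{a})} \frac{\partial y_k}{\partial x_j}\Big|_{\vec{a}}. \]
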

Suppose a function $F$ from an open set $U \subset \mathbb{R}^n$ to $\mathbb{R}^m$ is differentiable over $U$. Then for 
any unit vector $\vec{u} \in \mathbb{R}^n$, the directional derivative $D_{\vec{u}} F$ is itself a vector function 
from $U$ to $\mathbb{R}^m$, and we can consider the directional derivative $D_{\vec{v}}(D_{\vec{u}} F)$ along some 
unit vector $\vec{v}$. This second-order directional derivative is denoted by $D^2_{\vec{v}\vec{u}} F$. Higher-order 
directional derivatives are defined in the same way. 
If $\mathbb{R}^n$ is given a basis, then one can take higher-order partial derivatives with 
respect to this basis. Some common notations for the second partial derivative 

\[ \frac{\partial}{\partial x_j} \left( \frac{\partial F}{\partial x_i} \right) \]

are 

\[ \frac{\partial^2 F}{\partial x_j \partial x_i}, \quad \partial_j \partial_i F, \quad D_j D_i F, \quad F_{x_i x_j}, \quad F_{,ij}, \]

and notations for third partial derivatives are 

\[ \frac{\partial^3 F}{\partial x_k \partial x_j \partial x_i}, \quad \partial_k \partial_j \partial_i F, \quad F_{x_i x_j x_k}, \quad F_{,ijk}. \]

Most advanced physics texts use the notation $\partial_i F$ for the partial derivative $\partial F/\partial x_i$. 
In that case, the second and third partial derivatives are $\partial_j \partial_i F$ and $\partial_k \partial_j \partial_i F$, as 
indicated above. 
Note that the order of the indices or subscripts is important since it is possible 
that 

\[ \frac{\partial^2 F}{\partial x_1 \partial x_2} \neq \frac{\partial^2 F}{\partial x_2 \partial x_1}, \]

though we will see momentarily a condition that implies their equality. 
We conclude this section with two theorems from analysis and a comment on 
the $C^r$ notation. 


Theorem 1.3.6. Let $U$ be an open set in $\mathbb{R}^n$, let $F : U \to \mathbb{R}^m$ be a function, and 
let $\vec{a} \in U$. Suppose that for each $i = 1, 2, \ldots, n$, the partial derivative $\partial F/\partial x_i$ exists 
in a neighborhood of $\vec{a}$ and is continuous at $\vec{a}$. Then $F$ is differentiable at $\vec{a}$. 


Proof. (See Theorem 8.23 in [15].) 


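Theorem 1.3.7 (Clairaut's Theorem). Let $U$ be an open set in $\mathbb{R}^n$, let $F : U \to \mathbb{R}^m$ 
be a function, and let $\vec{a} \in U$. Suppose that 

\[ \frac{\partial F}{\partial x_i}, \quad \frac{\partial F}{\partial x_j}, \quad \text{and} \quad \frac{\partial^2 F}{\partial x_j \partial x_i} \]

exist in a neighborhood of $\vec{a}$ and that $\partial^2 F/\partial x_j \partial x_i$ is continuous at $\vec{a}$. Then 
$\partial^2 F/\partial x_i \partial x_j(\vec{a})$ exists and 

\[ \frac{\partial^2 F}{\partial x_i \partial x_j}(\vec{a}) = \frac{\partial^2 F}{\partial x_j \partial x_i}(\vec{a}). \]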


Proof. (See Theorem 8.24 in [15].) 
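Clairaut's Theorem can be illustrated numerically on a smooth function. In the sketch below, the sample function `f`, its hand-computed first partials `f_x` and `f_y`, and the step size are all assumptions chosen for illustration. The mixed partials are approximated in the two possible orders by differencing the first partial derivatives, so the two approximations are computed independently of each other.

```python
import math

def f(x, y):
    return math.exp(x * y) * math.sin(x + 2.0 * y)

def f_x(x, y):
    """Partial derivative of f with respect to x, computed by hand."""
    return math.exp(x * y) * (y * math.sin(x + 2.0 * y) + math.cos(x + 2.0 * y))

def f_y(x, y):
    """Partial derivative of f with respect to y, computed by hand."""
    return math.exp(x * y) * (x * math.sin(x + 2.0 * y) + 2.0 * math.cos(x + 2.0 * y))

def mixed_x_then_y(x, y, h=1e-5):
    """Differentiate in x first (exactly), then in y (central difference)."""
    return (f_x(x, y + h) - f_x(x, y - h)) / (2.0 * h)

def mixed_y_then_x(x, y, h=1e-5):
    """Differentiate in y first (exactly), then in x (central difference)."""
    return (f_y(x + h, y) - f_y(x - h, y)) / (2.0 * h)

point = (0.4, -0.7)
gap = abs(mixed_x_then_y(*point) - mixed_y_then_x(*point))
```

For this smooth function the gap is at the level of the discretization error; Problem 1.3.14 gives a function for which the two orders genuinely differ at the origin.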


Theorems 1.3.6 and 1.3.7 illustrate that certain nice properties occur when we 
not only assume that partial derivatives exist but that they are continuous at a 
particular point. For this reason, if $U$ is an open set in $\mathbb{R}^n$, we say that a function 
$F : U \to \mathbb{R}^m$ is of class $C^1$ if all of its partial derivatives exist and are continuous. 
By Theorem 1.3.6, a function of class $C^1$ is differentiable. We denote by $C^1(U, \mathbb{R}^m)$ 
the set of such functions. 

More generally, we say that the function $F$ is of class $C^r$, or write $F \in C^r(U, \mathbb{R}^m)$, 
if all of its partial derivatives of order up to $r$ exist and are continuous. By Clairaut's 
Theorem, we see that for a function of class $C^r$ all the mixed partial derivatives of order 
at most $r$ that differentiate the same number of times with respect to each variable are 
equal, regardless of the order of differentiation. To be 
consistent with this notation, we say that $F$ is of class $C^0$ if it is continuous and 
we say that it is of class $C^\infty$ if all of its higher partial derivatives exist and are 
continuous. Functions of class $C^\infty$ are called smooth. 

Finally, we say that a function $F : U \to \mathbb{R}^m$ is analytic if for all $\vec{a} \in U$, there 
exists an open ball $B_\delta(\vec{a}) \subset U$ such that over $B_\delta(\vec{a})$ the Taylor series of $F$ centered at 
$\vec{a}$ converges to $F(\vec{x})$. If $F : U \to \mathbb{R}^m$ is analytic, we write $F \in C^\omega(U, \mathbb{R}^m)$. 
This is a stronger condition than smooth since in order for a function to be analytic 
at $\vec{a}$, all of its partial derivatives must exist at $\vec{a}$. 

There is a natural chain of containment among these classes of functions: 

\[ C^\omega(U, \mathbb{R}^m) \subset C^\infty(U, \mathbb{R}^m) \subset \cdots \subset C^2(U, \mathbb{R}^m) \subset C^1(U, \mathbb{R}^m) \subset C^0(U, \mathbb{R}^m). \]


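Theorem 1.3.8 (First-Order Taylor Series). Let $\vec{a} \in \mathbb{R}^n$ and let $U = B_r(\vec{a})$ be the 
open ball of radius $r$ and center $\vec{a}$. Suppose that $f \in C^k(U, \mathbb{R})$ for $k \ge 1$. Then 

\[ f(\vec{x}) = f(\vec{a}) + \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(\vec{a})(x_i - a_i) + \sum_{i=1}^{n} g_i(\vec{x})(x_i - a_i) \tag{1.10} \]

for some functions $g_1, g_2, \ldots, g_n \in C^{k-1}(U, \mathbb{R})$ such that $g_i(\vec{a}) = 0$. 


Proof. Let $\vec{x}$ be any element in the ball $U$. The Fundamental Theorem of Calculus 
gives 

\[ f(\vec{x}) - f(\vec{a}) = \int_0^1 \frac{d}{dt} f(\vec{a} + t(\vec{x} - \vec{a}))\, dt. \]

By the chain rule 

\[ f(\vec{x}) - f(\vec{a}) = \int_0^1 \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(\vec{a} + t(\vec{x} - \vec{a}))(x_i - a_i)\, dt \]

and since $\partial f/\partial x_i(\vec{a})(x_i - a_i)$ is constant with respect to $t$, we have 

\[ f(\vec{x}) - f(\vec{a}) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(\vec{a})(x_i - a_i) + \sum_{i=1}^{n} (x_i - a_i) \int_0^1 \left( \frac{\partial f}{\partial x_i}(\vec{a} + t(\vec{x} - \vec{a})) - \frac{\partial f}{\partial x_i}(\vec{a}) \right) dt. \]

Setting 

\[ g_i(\vec{x}) = \int_0^1 \left( \frac{\partial f}{\partial x_i}(\vec{a} + t(\vec{x} - \vec{a})) - \frac{\partial f}{\partial x_i}(\vec{a}) \right) dt, \]

we obtain (1.10). We note that $g_i(\vec{a}) = 0$. Furthermore, it is possible to differentiate 
each $g_i(\vec{x})$ by passing the differentiation with respect to any $x_j$ variable underneath 
the integral with respect to $t$. Hence, we see that $g_i \in C^{k-1}(U, \mathbb{R})$ for all $i = 
1, \ldots, n$. 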
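The construction of the remainder functions $g_i$ in the proof above can be carried out numerically. The sketch below is an illustration under stated assumptions: the sample function `f`, its gradient `grad_f`, the base point, and the use of the composite Simpson rule for the integral over $[0,1]$ are choices made here, not taken from the text. It evaluates $g_i$ as the integral in the proof and checks the identity (1.10) at a sample point.

```python
import math

def f(x):
    return x[0] ** 2 * x[1] + math.sin(x[0])

def grad_f(x):
    """Partial derivatives of f, computed by hand."""
    return [2.0 * x[0] * x[1] + math.cos(x[0]), x[0] ** 2]

def simpson(func, n=200):
    """Composite Simpson rule on [0, 1] with n (even) subintervals."""
    h = 1.0 / n
    total = func(0.0) + func(1.0)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(k * h)
    return total * h / 3.0

def g(i, x, a):
    """g_i(x): integral over [0, 1] of df/dx_i(a + t(x - a)) - df/dx_i(a)."""
    def integrand(t):
        point = [aj + t * (xj - aj) for aj, xj in zip(a, x)]
        return grad_f(point)[i] - grad_f(a)[i]
    return simpson(integrand)

def taylor_rhs(x, a):
    """Right-hand side of (1.10) with the g_i defined as in the proof."""
    ga = grad_f(a)
    return f(a) + sum((ga[i] + g(i, x, a)) * (x[i] - a[i]) for i in range(len(a)))

a = [0.5, -1.0]
x = [0.9, -0.4]
residual = abs(taylor_rhs(x, a) - f(x))
```

The residual reflects only the quadrature and rounding error, since (1.10) is an exact identity rather than an approximation.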
PROBLEMS 

1.3.1. Prove Theorem 1.3.1. 

1.3.2. Suppose that f and g are differentiable at @ € R”. Prove that the function ip g 
is differentiable at @ and that 

d(F’- G)¢ = F(a) -dGz + G(d@) - dF 
are linear functions. 

1.3.3. Let F(r,0,¢) = (rcos@sin¢,rsin@ sin ¢,rcos ¢). Calculate the Jacobian matrix. 
Prove that the Jacobian is the function r? sin ¢. 

1.3.4. Let 

21 = 2y1 + 3y2, Yr =e"l+a2+%3, 
. and ee 
22 = y1yo, yo Serr" + a1. 
Use the chain rule to calculate the partial derivatives oa for 7 = 1,2 and j = 
1,2,3. : 

1.3.5. Let F be a differentiable function from an open set U C R” to R”, and let G be 
a differentiable function from an open set V C R” to U. Prove that J(Fo G) = 
J(F)J(G). 

1.3.6. Suppose that U and V are open sets in R” and that F is bijective from U to V. 
Suppose in addition that F is differentiable on U and F'~' is differentiable on V. 
Prove that for all @ € U, the linear function dF is invertible and that 

(dFz)~* = dF aia. 
Conclude that J(F7') = 1/J(F). 
1.3.7. Let F be a function from U C R? to R® such that dF has rank 2 for all # € U. 
Let @ be a regular curve from an interval J to U. Show that 
(a) the function A(t) = F(@(t)) is a regular curve in R*; 
(b) the speed of f satisfies 
I Sel = Vall Gey +2Ge Se) Ge + Le Gey 
ot 0x4 Ox, Ox2/ dt dt 0x2 , 
1.3.8. Repeat part (b) of Problem 1.3.7, but prove that 


5 (IP = (a(t) " [AF] | [dF] a2). 


[Hint: Recall that we view the vectors 4, 6 € R” as column vectors and @-b = @"b 
as a matrix product.| 
1.3.9. Let F(s,t) = (s*t+t°, te’ +se*), and let & be the unit vector in the direction (1, 1) 
and @ be the unit vector in the direction (2,3). Calculate the second directional 
derivative function D?,F. [Hint: This is a function of (s, t).] 

1.3.10. Let F be a function from an open set U C R” to R™. Let v and @ be two unit 
vectors in R”. Prove that 


1.3.11. Let F(r,@) = (rcos6,rsin@). Calculate all the second partial derivatives of F’. 
Prove that F is of class C® over all of R?. 


1.3.12. Let F(u,v) = (u? + ve?",v + tan7/(u + 3),sinv). Find the domain of F. Cal- 
culate all of its second partial derivatives. Calculate the following third partial 
derivatives: Fouvu, Fouv, and Fuuv- 


1.3.13. If (wi, we) = (e-™1*"3, cos(a2 + 23)), calculate 


O?wi O?wi ani Owe 
0x1 0x3 : 0x302X2 , Ox, 0x2023 ; 


1.3.14. Let the function f : R? > R be defined by 
2st(s? — t? : 

f(s t)= a if (s,t) 4 (0,0), 

0, if (s,t) = (0,0). 


Show that f is of class C'. Show that the mixed second partial derivatives fs: 
and fs exist at every point of R?. Show that fst(0,0) A fes(0, 0). 


1.4 Inverse and Implicit Function Theorems 


In single- and multivariable calculus of a function $F : \mathbb{R}^n \to \mathbb{R}$, one defines a critical 
point as a point $\vec{a} = (a_1, \ldots, a_n)$ such that the gradient of $F$ at $\vec{a}$ is $\vec{0}$, i.e., 

\[ \nabla F(\vec{a}) = \left( \frac{\partial F}{\partial x_1}(\vec{a}), \ldots, \frac{\partial F}{\partial x_n}(\vec{a}) \right) = \vec{0}. \]

At such a point, $F$ is said to have a flat tangent line or tangent plane, and, according 
to standard theorems in calculus, $F(\vec{a})$ is either a local minimum, local maximum, 
or a "saddle point." This notion is a special case of the following general definition. 


Definition 1.4.1. Let $U$ be an open subset of $\mathbb{R}^n$ and $F : U \to \mathbb{R}^m$ a differentiable 
function. We call $q \in U$ a critical point of $F$ if $F$ is not differentiable at $q$ or if 
$dF_q : \mathbb{R}^n \to \mathbb{R}^m$ is not of maximum rank, i.e., if $\mathrm{rank}(dF_q) < \min(m, n)$. If $q$ is a 
critical point of $F$, we call $F(q)$ a critical value. If $p \in \mathbb{R}^m$ is not a critical value of 
$F$ (even if $p$ is not in the image of $F$), then we call $p$ a regular value of $F$. 
We point out that this definition simultaneously generalizes the notion of a 
critical point for functions $F : U \to \mathbb{R}$, with $U$ an open subset of $\mathbb{R}^n$, and the 
definition for a critical point of a parametric curve in $\mathbb{R}^n$ (Definition 3.2.1 in [5]). 
If $m = n$, the notion of a critical point has a few alternate equivalent criteria. 

Proposition 1.4.2. Let $U$ be an open subset of $\mathbb{R}^n$, $F : U \to \mathbb{R}^n$ a differentiable
function, and $q$ a point in $U$ such that $F$ is differentiable at $q$. The following are
equivalent:

1. $q$ is a critical point of $F$.

2. $J(F)(q) = 0$.

3. The set of partial derivatives $\{\partial F/\partial x_1(q), \ldots, \partial F/\partial x_n(q)\}$ is a linearly
dependent set of vectors.

4. The differential $dF_q$ is not invertible.


Proof. These all follow from Theorem 1.1.8. 


More generally, when $n$ is not necessarily equal to $m$, linear algebra gives the
following equivalent statements for when $q$ is a critical point.

Determining for what values of $q$ in the domain $U$ the differential $dF_q$ does not
have maximal rank is not easy if done simply by looking at the matrix of functions
$[dF_q]$. The following proposition provides a concise criterion.


Proposition 1.4.3. Let $F : U \to \mathbb{R}^m$ be a function where $U$ is an open subset of
$\mathbb{R}^n$. Let $q \in U$ such that $F$ is differentiable at $q$. Then the following are equivalent:

1. $q$ is a critical point of $F$.

2. The determinants of all the maximal square submatrices of $[dF_q]$ are 0.

3. The sum of the squares of the determinants of all the maximal square
submatrices of $[dF_q]$ is 0.

Furthermore, if $n > m$ and $A = [dF_q]$, then $q$ is a critical point of $F$ if and only if
$\det(AA^T) = 0$.


Proof. To prove $1 \Leftrightarrow 2$, note that by definition, $q$ is a critical point if $dF_q$ does
not have maximal rank, which means that the set of column vectors or the set of
row vectors of $[dF_q]$ is linearly dependent. This is equivalent to the determinants
of all maximal submatrices of $A = [dF_q]$ (sometimes referred to as the maximal minors of
$A$) being 0 since, if one such determinant were not 0, then no nontrivial linear
combination of the columns of $[dF_q]$ or of the rows of $[dF_q]$ would be 0, and hence,
this set would be linearly independent.

The equivalence $2 \Leftrightarrow 3$ is trivial.

To prove the last part of the proposition, assuming that $n > m$, recall that
if $\{\vec{v}_1, \ldots, \vec{v}_m\}$ are vectors in $\mathbb{R}^n$, the $m$-volume of the parallelepiped formed by
$\{\vec{v}_1, \ldots, \vec{v}_m\}$ is
\[
\sqrt{\det(B^T B)},
\]
where $B$ is the $n \times m$ matrix with the $\vec{v}_i$ as columns (see [14, Fact 6.3.7]). Now
the $m$-volume of this parallelepiped is 0 if and only if $\{\vec{v}_1, \ldots, \vec{v}_m\}$ are linearly
dependent. Thus, taking $B = A^T$ and taking the $\vec{v}_i$ as the columns of $A^T$ establishes
the result.


By referring to some advanced linear algebra, it is possible to prove directly
that, if $n > m$, Condition 3 in the above proposition implies that $\det(AA^T) = 0$. In
fact, even more can be said. If $A$ is an $m \times n$ matrix with $n > m$, then $\det(AA^T)$
is equal to the sum of the squares of the maximal minors of $A$. (See Proposition
C.1.2 in Appendix C.)
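
The identity between $\det(AA^T)$ and the sum of the squared maximal minors can be checked on small integer matrices. The following is only a sketch using the Python standard library; the function names (`det`, `gram_det`, `sum_squared_maximal_minors`) and the sample matrices are illustrative choices, not taken from the text.

```python
from itertools import combinations


def det(M):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def gram_det(A):
    """det(A A^T) for an m x n matrix A given as a list of rows."""
    m = len(A)
    gram = [[sum(a * b for a, b in zip(A[i], A[k])) for k in range(m)]
            for i in range(m)]
    return det(gram)


def sum_squared_maximal_minors(A):
    """Sum of the squares of the determinants of all m x m submatrices of A (n >= m)."""
    m, n = len(A), len(A[0])
    total = 0
    for cols in combinations(range(n), m):
        sub = [[row[c] for c in cols] for row in A]
        total += det(sub) ** 2
    return total


full_rank = [[1, 2, 3], [4, 5, 6]]   # rank 2: both quantities are nonzero
deficient = [[1, 2, 3], [2, 4, 6]]   # rank 1: both quantities vanish

print(gram_det(full_rank), sum_squared_maximal_minors(full_rank))
print(gram_det(deficient), sum_squared_maximal_minors(deficient))
```

For a rank-deficient matrix both quantities are 0, which is the criterion for a critical point in Proposition 1.4.3.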


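Example 1.4.4. For example, consider the function $F : \mathbb{R}^3 \to \mathbb{R}^2$ defined by
$F(x,y,z) = (x^2 + 3y + z^3, xy + z^2 + 1)$. The Jacobian matrix for this function is
\[
[dF] = \begin{pmatrix} 2x & 3 & 3z^2 \\ y & x & 2z \end{pmatrix}.
\]
In this case, the easiest way to find the critical points of this function is to use the
second equivalence statement in Proposition 1.4.3. The maximal $2 \times 2$ submatrices
are
\[
\begin{pmatrix} 2x & 3 \\ y & x \end{pmatrix}, \qquad
\begin{pmatrix} 2x & 3z^2 \\ y & 2z \end{pmatrix}, \qquad
\begin{pmatrix} 3 & 3z^2 \\ x & 2z \end{pmatrix},
\]
so since critical points occur where all of these have determinant 0, the critical
points satisfy the system of equations
\[
\begin{cases} 2x^2 - 3y = 0, \\ 4xz - 3yz^2 = 0, \\ 6z - 3xz^2 = 0. \end{cases}
\]
This is equivalent to
\[
\begin{cases} y = \tfrac{2}{3}x^2, \\ 4xz - 2x^2z^2 = 0, \\ z(2 - xz) = 0, \end{cases}
\Longleftrightarrow
\begin{cases} y = \tfrac{2}{3}x^2, \\ 2xz(2 - xz) = 0, \\ z(2 - xz) = 0. \end{cases}
\]
Thus, the set of critical points of $F$ is
\[
\left\{ \left(x, \tfrac{2}{3}x^2, \tfrac{2}{x}\right) \in \mathbb{R}^3 \,\middle|\, x \in \mathbb{R} - \{0\} \right\} \cup \left\{ \left(x, \tfrac{2}{3}x^2, 0\right) \in \mathbb{R}^3 \,\middle|\, x \in \mathbb{R} \right\}.
\]
The set of critical values is then
\[
\left\{ \left(3x^2 + \tfrac{8}{x^3}, \tfrac{2}{3}x^3 + \tfrac{4}{x^2} + 1\right) \in \mathbb{R}^2 \,\middle|\, x \in \mathbb{R} - \{0\} \right\} \cup \left\{ \left(3x^2, \tfrac{2}{3}x^3 + 1\right) \in \mathbb{R}^2 \,\middle|\, x \in \mathbb{R} \right\}.
\]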
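
As a sanity check of Example 1.4.4, one can evaluate the three maximal minors of the Jacobian matrix on the two families of critical points using exact rational arithmetic. This is a minimal sketch; the helper names (`maximal_minors`, `is_critical`, `on_first_branch`, `on_second_branch`) are hypothetical and only serve the illustration.

```python
from fractions import Fraction


def jacobian(x, y, z):
    """Jacobian matrix of F(x, y, z) = (x^2 + 3y + z^3, xy + z^2 + 1)."""
    return [[2 * x, 3, 3 * z ** 2],
            [y, x, 2 * z]]


def maximal_minors(x, y, z):
    """Determinants of the three 2 x 2 submatrices of the Jacobian matrix."""
    (a, b, c), (d, e, f) = jacobian(x, y, z)
    return (a * e - b * d, a * f - c * d, b * f - c * e)


def is_critical(x, y, z):
    """A point is critical exactly when every maximal minor vanishes."""
    return all(minor == 0 for minor in maximal_minors(x, y, z))


def on_first_branch(x):
    """The point (x, 2x^2/3, 2/x), defined for x != 0."""
    x = Fraction(x)
    return (x, Fraction(2, 3) * x ** 2, 2 / x)


def on_second_branch(x):
    """The point (x, 2x^2/3, 0)."""
    x = Fraction(x)
    return (x, Fraction(2, 3) * x ** 2, Fraction(0))


print(is_critical(*on_first_branch(3)))    # point on the branch xz = 2
print(is_critical(*on_second_branch(-5)))  # point on the branch z = 0
print(is_critical(1, 1, 1))                # a point on neither branch
```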

One important aspect of critical points already arises with real functions. With a
real differentiable function $f : [a,b] \to \mathbb{R}$, if $f'(x_0) = 0$, one can show that $f$ does not
have an inverse function that is differentiable over a neighborhood of $x_0$. Conversely,
if $f'(x_0) \neq 0$, the function $f$ has a differentiable inverse in a neighborhood of $x_0$,
with
\[
(f^{-1})'(y_0) = \frac{1}{f'(x_0)}, \qquad \text{where } y_0 = f(x_0).
\]
A similar fact holds for multivariable functions and is called the Inverse Function
Theorem.

The proofs of the Inverse Function Theorem and the following Implicit Function
Theorem are quite long and not necessary for the purposes of this book, so we refer
the reader to a book on analysis for a proof (see, for example, Section 8.5 in [15]).
Instead, we simply state the theorems and present a few examples.


Theorem 1.4.5 (Inverse Function Theorem). Let $F$ be a function from an open
set $U \subset \mathbb{R}^n$ to $\mathbb{R}^n$, and suppose that $F$ is of class $C^r$, with $r \geq 1$. If $q \in U$ is not a
critical point of $F$, then $dF_q$ is invertible and there exists a neighborhood $V$ of $q$ such
that $F$ is one-to-one on $V$, $F(V)$ is open, and the inverse function $F^{-1} : F(V) \to V$
is of class $C^r$. Furthermore, for all $p \in F(V)$, with $p = F(q)$,
\[
d(F^{-1})_p = (dF_q)^{-1}.
\]

In many situations, it is impossible to explicitly calculate the inverse function
$F^{-1}$. The following example illustrates the Inverse Function Theorem in a situation
in which we can calculate the inverse function.


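Example 1.4.6. Consider the function $F(s,t) = (s^2 - t^2, 2st)$ and $q = (2,3)$. Note
that $F$ is defined on all of $U = \mathbb{R}^2$. The Jacobian matrix is
\[
[dF] = \begin{pmatrix} 2s & -2t \\ 2t & 2s \end{pmatrix},
\]
so the Jacobian is the function $J(F)(s,t) = 4(s^2 + t^2)$. By Proposition 1.4.2, the
only critical point of $F$ is $(0,0)$, so $F$ satisfies the conditions of the Inverse Function
Theorem at $q$.

Now with $q = (2,3)$, by the Inverse Function Theorem, since $p = F(q) =
(-5, 12)$, we have
\[
[dF_q] = \begin{pmatrix} 4 & -6 \\ 6 & 4 \end{pmatrix} \quad \text{and} \quad [d(F^{-1})_p] = [dF_q]^{-1} = \frac{1}{26}\begin{pmatrix} 2 & 3 \\ -3 & 2 \end{pmatrix}.
\]

For simplicity, let us assume $V = \{(s,t) \in \mathbb{R}^2 \mid s > 0, t > 0\}$ and note that
$q \in V$. Setting $(x,y) = F(s,t)$ and solving for $(s,t)$, we find that $F(V) = \{(x,y) \in
\mathbb{R}^2 \mid y > 0\}$ and that the inverse of $F$ is given by
\[
F^{-1}(x,y) = \left( \sqrt{\frac{\sqrt{x^2+y^2} + x}{2}}, \ \sqrt{\frac{\sqrt{x^2+y^2} - x}{2}} \right).
\]
Calculating the partial derivative $\partial s/\partial x$, we have
\[
\frac{\partial s}{\partial x} = \frac{1}{2\sqrt{2}\sqrt{\sqrt{x^2+y^2} + x}} \left( \frac{x}{\sqrt{x^2+y^2}} + 1 \right)
= \frac{1}{2\sqrt{x^2+y^2}} \sqrt{\frac{\sqrt{x^2+y^2} + x}{2}},
\]
and similarly, the Jacobian matrix of $F^{-1}$ is
\[
[dF^{-1}] = \frac{1}{2\sqrt{x^2+y^2}} \begin{pmatrix} \sqrt{\dfrac{\sqrt{x^2+y^2} + x}{2}} & \sqrt{\dfrac{\sqrt{x^2+y^2} - x}{2}} \\[2ex] -\sqrt{\dfrac{\sqrt{x^2+y^2} - x}{2}} & \sqrt{\dfrac{\sqrt{x^2+y^2} + x}{2}} \end{pmatrix}.
\]
Plugging in $p = (-5,12) = F(q)$, we calculate directly that
\[
[d(F^{-1})_p] = \frac{1}{2\sqrt{(-5)^2 + 12^2}} \begin{pmatrix} \sqrt{\dfrac{13 - 5}{2}} & \sqrt{\dfrac{13 + 5}{2}} \\[2ex] -\sqrt{\dfrac{13 + 5}{2}} & \sqrt{\dfrac{13 - 5}{2}} \end{pmatrix}
= \frac{1}{26}\begin{pmatrix} 2 & 3 \\ -3 & 2 \end{pmatrix} = [dF_q]^{-1},
\]
thereby illustrating the Inverse Function Theorem.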
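
The conclusion of Example 1.4.6 can also be checked numerically: a finite-difference Jacobian of the explicit inverse at $p = (-5, 12)$ should agree with the inverse of $[dF_q]$. The sketch below assumes central differences with a step size `h` chosen for illustration; the function names are not from the text.

```python
import math


def F(s, t):
    return (s * s - t * t, 2 * s * t)


def F_inverse(x, y):
    """Local inverse of F on the quadrant s > 0, t > 0 (requires y > 0)."""
    r = math.hypot(x, y)
    return (math.sqrt((r + x) / 2), math.sqrt((r - x) / 2))


def numerical_jacobian(G, x, y, h=1e-6):
    """Central-difference approximation of the 2 x 2 Jacobian matrix of G at (x, y)."""
    gx_plus, gx_minus = G(x + h, y), G(x - h, y)
    gy_plus, gy_minus = G(x, y + h), G(x, y - h)
    return [[(gx_plus[i] - gx_minus[i]) / (2 * h),
             (gy_plus[i] - gy_minus[i]) / (2 * h)] for i in range(2)]


def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


q = (2.0, 3.0)
p = F(*q)                                      # (-5.0, 12.0)
dF_q = [[2 * q[0], -2 * q[1]], [2 * q[1], 2 * q[0]]]
predicted = inverse_2x2(dF_q)                  # (dF_q)^{-1}, i.e. (1/26) [[2, 3], [-3, 2]]
measured = numerical_jacobian(F_inverse, *p)   # finite-difference Jacobian of F^{-1} at p

for row_predicted, row_measured in zip(predicted, measured):
    print(row_predicted, row_measured)
```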


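Another important theorem about functions in the neighborhood of a point $p$
that is not critical is the fact that the level set through $p$ can be parametrized by
(is the image of) an appropriate function. This is the Implicit Function Theorem.

Theorem 1.4.7 (Implicit Function Theorem). Let $F$ be a function from an open
set $U \subset \mathbb{R}^n$ to $\mathbb{R}^m$, with $n > m$, and suppose that $F$ is of class $C^r$, with $r \geq 1$. Let
$q \in U$, and let $\Sigma$ be the level set of $F$ through $q$, defined as
\[
\Sigma = \{\vec{x} \in \mathbb{R}^n \mid F(\vec{x}) = F(q)\}.
\]
If $q \in U$ is not a critical point, then the coordinates of $\mathbb{R}^n$ can be relabeled so that
\[
[dF_q] = ( S \mid T ),
\]
with $T$ an $m \times m$ invertible matrix. Then there exist an open neighborhood $V$ of
$q$ in $\mathbb{R}^n$, an open neighborhood $W$ of $a = (q_1, \ldots, q_{n-m})$ in $\mathbb{R}^{n-m}$, and a function
$g : W \to \mathbb{R}^m$ that is of class $C^r$ such that $\Sigma \cap V$ is the graph of $g$, i.e.,
\[
\Sigma \cap V = \{(\vec{s}, g(\vec{s})) \mid \vec{s} \in W\}.
\]
Furthermore, the Jacobian matrix of $g$ at $a$ is
\[
[dg_a] = -T^{-1}S.
\]

Example 1.4.8. Let $F : \mathbb{R}^3 \to \mathbb{R}$ be a function. If $c$ is some constant, we expect
that the solution set $\Sigma$ to the equation $F(x,y,z) = c$ is a surface in $\mathbb{R}^3$. Suppose
that around some point $p$ that satisfies the equation, we could consider $\Sigma$ as the
graph of a function $z = f(x,y)$. If it is not tractable to exactly solve for $z$ and get
$f(x,y)$ explicitly, then $f$ is called an implicit function. By the chain rule we have
\[
\frac{\partial F}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial x} = 0
\quad \text{and} \quad
\frac{\partial F}{\partial x}\frac{\partial x}{\partial y} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial y} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial y} = 0.
\]
Hence,
\[
\frac{\partial F}{\partial x} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial x} = 0
\quad \text{and} \quad
\frac{\partial F}{\partial y} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial y} = 0
\]
\[
\Longrightarrow \quad \frac{\partial z}{\partial x} = -\frac{\partial F}{\partial x} \Big/ \frac{\partial F}{\partial z}
\quad \text{and} \quad
\frac{\partial z}{\partial y} = -\frac{\partial F}{\partial y} \Big/ \frac{\partial F}{\partial z}.
\]
This work is called implicit differentiation. Organizing this last line into a matrix
of a differential, we have
\[
[df] = -\left( \frac{\partial F}{\partial z} \right)^{-1} \begin{pmatrix} \dfrac{\partial F}{\partial x} & \dfrac{\partial F}{\partial y} \end{pmatrix} = -T^{-1}S,
\]
where $[dF] = ( S \mid T )$ as in the Implicit Function Theorem. In this example, we
began by assuming that a neighborhood of $p \in \Sigma$ could be viewed as the graph of
$z = f(x,y)$ and proceeded from there without knowing that we were allowed to do
so. The Implicit Function Theorem gives a condition under which we are allowed to
proceed as we did. In this specific case, the theorem states that we can make this
assumption when $\partial F/\partial z \neq 0$, which is precisely what is required for our calculations
to have meaning.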


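Example 1.4.9. We use the Implicit Function Theorem to tell us something about
the set
\[
\Sigma = \{(x,y,z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1 \text{ and } x + y + z = 1\}.
\]
This is the intersection between a sphere and a plane, which is a circle lying in $\mathbb{R}^3$.
(In Figure 1.6, $\Sigma$ is the circle shown as the intersection of the sphere and the plane.)
Consider the point $q = \left(\frac{4}{13}, -\frac{3}{13}, \frac{12}{13}\right) \in \Sigma$. To study $\Sigma$ near $q$, consider the function
$F : \mathbb{R}^3 \to \mathbb{R}^2$ defined by $F(x,y,z) = (x^2 + y^2 + z^2, x + y + z)$. The Jacobian matrix
of $F$ is
\[
[dF] = \begin{pmatrix} 2x & 2y & 2z \\ 1 & 1 & 1 \end{pmatrix},
\]
and so the critical points of $F$ are points $(x,y,z) \in \mathbb{R}^3$ such that $x = y = z$. Thus,
$q$ is not a critical point and
\[
[dF_q] = \begin{pmatrix} \frac{8}{13} & -\frac{6}{13} & \frac{24}{13} \\ 1 & 1 & 1 \end{pmatrix}.
\]

Figure 1.6: Example 1.4.9.

Writing
\[
S = \begin{pmatrix} \frac{8}{13} \\ 1 \end{pmatrix} \quad \text{and} \quad T = \begin{pmatrix} -\frac{6}{13} & \frac{24}{13} \\ 1 & 1 \end{pmatrix},
\]
since $T$ is invertible, $F$ and $q$ satisfy the criteria of the Implicit Function Theorem.
Thus, there exist an open neighborhood $V$ of $q$ in $\mathbb{R}^3$, an open interval $W$ around
$a = \frac{4}{13}$ in $\mathbb{R}$, and a function $g : W \to \mathbb{R}^2$ such that the portion of the circle $\Sigma \cap V$
is the graph of $g$. Also, the Jacobian matrix of $g$ at $a$ (the gradient of $g$ at $a$) is
\[
[dg_a] = \nabla g\left(\tfrac{4}{13}\right) = -T^{-1}S = -\begin{pmatrix} -\frac{13}{30} & \frac{24}{30} \\ \frac{13}{30} & \frac{6}{30} \end{pmatrix} \begin{pmatrix} \frac{8}{13} \\ 1 \end{pmatrix} = \begin{pmatrix} -\frac{8}{15} \\ -\frac{7}{15} \end{pmatrix}. \tag{1.11}
\]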


One can find $\Sigma$ by first noting that the subspace $x + y + z = 0$ has $\{(0,-1,1),
(-2,1,1)\}$ as an orthogonal basis. Thus, the plane $x + y + z = 1$ can be parametrized
by
\[
\vec{X}(u,v) = \left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right) + u(0,-1,1) + v(-2,1,1),
\]
and all vectors in this expression are orthogonal to each other. The additional
condition $x^2 + y^2 + z^2 = 1$, which is equivalent to $\vec{X} \cdot \vec{X} = 1$, leads to $2u^2 + 6v^2 = \frac{2}{3}$.
This shows that the set $\Sigma$ can be parametrized by
\[
\begin{cases}
x = \frac{1}{3} - \frac{2}{3}\sin t, \\
y = \frac{1}{3} - \frac{1}{\sqrt{3}}\cos t + \frac{1}{3}\sin t, \\
z = \frac{1}{3} + \frac{1}{\sqrt{3}}\cos t + \frac{1}{3}\sin t.
\end{cases}
\]
However, this parametrization is not the one described by the Implicit Function
Theorem. But by using it, one can find that in a neighborhood of $q = (4/13, -3/13, 12/13)$,
$\Sigma$ is parametrized by
\[
\left( x, \ \frac{1-x}{2} - \frac{1}{2}\sqrt{1 + 2x - 3x^2}, \ \frac{1-x}{2} + \frac{1}{2}\sqrt{1 + 2x - 3x^2} \right),
\]
and thus the implicit function $g$ in Theorem 1.4.7 is
\[
g(x) = \left( \frac{1-x}{2} - \frac{1}{2}\sqrt{1 + 2x - 3x^2}, \ \frac{1-x}{2} + \frac{1}{2}\sqrt{1 + 2x - 3x^2} \right).
\]
From here it is not difficult to verify Equation (1.11) directly.
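
One way to carry out this verification is sketched below: the product $-T^{-1}S$ is computed exactly with rational arithmetic and compared with a finite-difference derivative of the explicit function $g$ at $a = 4/13$. The helper names and the step size `h` are assumptions made for the illustration.

```python
from fractions import Fraction
import math

# S and T are the blocks of [dF_q] at q = (4/13, -3/13, 12/13).
S = [Fraction(8, 13), Fraction(1)]
T = [[Fraction(-6, 13), Fraction(24, 13)],
     [Fraction(1), Fraction(1)]]


def minus_T_inverse_S(T, S):
    """Exact value of -T^{-1} S for a 2 x 2 matrix T and a column S."""
    (a, b), (c, d) = T
    det = a * d - b * c
    T_inv = [[d / det, -b / det], [-c / det, a / det]]
    return [-(T_inv[i][0] * S[0] + T_inv[i][1] * S[1]) for i in range(2)]


def g(x):
    """Explicit implicit function (y, z) = g(x) near x = 4/13."""
    root = math.sqrt(1 + 2 * x - 3 * x * x)
    return ((1 - x) / 2 - root / 2, (1 - x) / 2 + root / 2)


def g_prime(x, h=1e-6):
    """Central-difference approximation of g'(x)."""
    plus, minus = g(x + h), g(x - h)
    return [(plus[i] - minus[i]) / (2 * h) for i in range(2)]


dg_a = minus_T_inverse_S(T, S)    # exact: [-8/15, -7/15]
print(dg_a)
print(g_prime(4 / 13))            # should be close to [-0.5333..., -0.4666...]
```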


Example 1.4.9 illustrates the use of the Implicit Function Theorem. However,
though the theorem establishes the existence of the implicit function $g$ and provides
a method to calculate $[dg_a]$, the theorem provides no method to calculate the
function $g$. In fact, unlike in Example 1.4.9, in most cases, one cannot calculate $g$
with elementary functions.


PROBLEMS 


1.4.1. Find the critical points of the following R > R functions: (a) f(x) = 2°, (b) 
g(x) = sina, and (c) h(x) = xv? — 3027 +e+1. 


1.4.2. Find all the critical points of the function F(x, y) = (x? — y+y’, x? — y) defined 
over all R?. 


1.4.3. Let F : R® > R® be defined by F(z, y,z) = (2? — vy, 2° — 3xyz,27 + y? +2”). 


(a) Find an equation describing the critical points of this function. (If you have 
access to a computer algebra system, plot it.) 

(b) Prove that if $(x_0, y_0, z_0)$ is a critical point of $F$, then any point $(\lambda x_0, \lambda y_0, \lambda z_0)$,
with $\lambda \in \mathbb{R}$, is also a critical point. (That is, if $(x_0, y_0, z_0)$ is a critical point,
then any point on the line through $(0,0,0)$ and $(x_0, y_0, z_0)$ is also critical.
We say that the equation for the critical points is a homogeneous equation.)


1.4.4. Let F : R® > R? be defined by F(x,y,z) = (e*¥,zcosx). Find all the critical 
points of F. 

1.4.5. Consider the function $f : \mathbb{R}^3 \to \mathbb{R}^3$ defined by
\[
f(x_1, x_2, x_3) = (x_1 \cos x_2 \sin x_3, \ x_1 \sin x_2 \sin x_3, \ x_1 \cos x_3).
\]
Find the critical points and the critical values of $f$.


1.4.6. Let $F : \mathbb{R}^2 \to \mathbb{R}^2$ be the function defined by $F(s,t) = (s^3 - 3st^2, 3s^2t - t^3)$, and let
$q = (2,3)$. Find the critical points of $F$. Prove that there exists a neighborhood
$V$ of $q$ such that $F$ is one-to-one on $V$ so that $F^{-1} : F(V) \to V$ exists. Let
$p = F(q) = (-46, 9)$. Find $d(F^{-1})_{(-46,9)}$.

1.4.7. Let $F : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $F(x,y) = (y^2 \sin x + 1, (x + 2y)\cos y)$. Show that
$(0, \pi/2)$ is not a critical point of $F$. Show that $F$ is a bijection from a neighborhood
$U$ of $(0, \pi/2)$ to a neighborhood $V$ of $(1,0)$. If $G : V \to U$ is the inverse function
$G = F^{-1}$, then find the matrix of $dG_{(1,0)}$.

1.4.8. Consider the function $F : (1, +\infty)^2 \to (1, +\infty)^2$ defined by $F(x,y) = (x^y, y^x)$.

(a) Find the set of critical points of $F$.

(b) Show that on a neighborhood $V$ of $q = (2,3)$ the function $F$ is one-to-one.

(c) Calling $F^{-1} : F(V) \to V$ the inverse function near $q$, use the Inverse Function
Theorem to determine $d(F^{-1})_{(8,9)}$.

1.4.9. Consider the function
\[
f(x_1, x_2, x_3) = \left( \frac{x_2 + x_3}{1 + x_1 + x_2 + x_3}, \ \frac{x_1 + x_3}{1 + x_1 + x_2 + x_3}, \ \frac{x_1 + x_2}{1 + x_1 + x_2 + x_3} \right)
\]
defined over the domain $U = \mathbb{R}^3 - \{(x_1, x_2, x_3) \mid 1 + x_1 + x_2 + x_3 = 0\}$.

(a) Show that no point in the domain of $f$ is a critical point. [Hint: Prove that

(b) Prove that $f$ is injective.

(c) Find $[df^{-1}]$ in terms of $(x_1, x_2, x_3)$ at every point using the Inverse Function
Theorem.

(d) Show that the inverse function is
\[
f^{-1}(y_1, y_2, y_3) = \left( \frac{-y_1 + y_2 + y_3}{2 - y_1 - y_2 - y_3}, \ \frac{y_1 - y_2 + y_3}{2 - y_1 - y_2 - y_3}, \ \frac{y_1 + y_2 - y_3}{2 - y_1 - y_2 - y_3} \right).
\]

(e) Prove that $f$ is a bijection between $U = \mathbb{R}^3 - \{(x_1, x_2, x_3) \mid 1 + x_1 + x_2 + x_3 = 0\}$
and $V = \mathbb{R}^3 - \{(x_1, x_2, x_3) \mid 2 - x_1 - x_2 - x_3 = 0\}$.

1.4.10. Verify all the calculations of Example 1.4.9.

1.4.11. Let $\Sigma$ be the curve in $\mathbb{R}^3$ defined by
\[
\begin{cases} 4x^2 + 5y^2 + z^2 = 33, \\ x^2 + 4y^2 + 2z^2 = 35. \end{cases}
\]
Using the Implicit Function Theorem, show that near the point $q = (1,2,3)$,
$\Sigma$ can be parametrized by $(x, g_1(x), g_2(x))$. Find $[dg_1]$ and use this to give a
parametrization of the tangent line to $\Sigma$ at $q$.

1.4.12. Let $\Sigma$ be the level set in $\mathbb{R}^4$ defined by
\[
\begin{cases} x^2 + 2y^2 + 3z^2 + 4w^2 = 24, \\ x^3 w - 2y^2z^2 + w^2 = 20. \end{cases}
\]
Let $F(x,y,z,w) = (x^2 + 2y^2 + 3z^2 + 4w^2, \ x^3 w - 2y^2z^2 + w^2)$.

(a) Prove that $q = (3,2,1,1)$ is not a critical point of $F$ and observe that $q \in \Sigma$.

(b) Using the Implicit Function Theorem, show that there is an open neighborhood
$W$ of $a = (3,2)$ in $\mathbb{R}^2$ and a function $g : W \to \mathbb{R}^2$ such that a
neighborhood of $q$ in $\Sigma$ is the graph of $g$.

(c) Calculate $[dg]$ over $W$.

(d) Use this to provide a parametrization of the tangent plane to $\Sigma$ at $q$.


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CHAPTER 2 


Variable Frames 


The strategy of choosing a particular coordinate system or frame to perform a 
calculation or to present a concept is ubiquitous in both mathematics and physics. 
For example, Newton’s equations of planetary motion are much easier to solve in 
polar coordinates than in Cartesian coordinates. In the differential geometry of 
curves, calculations of local properties are often simpler when carried out in the 
Frenet frame associated to the curve at a point. (See [5, Chapter 3].) This chapter 
introduces general coordinate systems on R” and the concept of variable frames in 
a consistent and general manner. 


2.1 Frames Associated to Coordinate Systems 


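Many problems in introductory mechanics involve finding the trajectory of a particle
under the influence of various forces and/or subject to certain constraints. The first
approach uses the coordinate functions and describes the trajectory as
\[
\vec{r}(t) = (x(t), y(t), z(t)) = x(t)\,\vec{\imath} + y(t)\,\vec{\jmath} + z(t)\,\vec{k}.
\]
Newton’s equations of motion then lead to differential equations in the three
coordinate functions $x(t)$, $y(t)$, and $z(t)$. The velocity function is the derivative, namely
\[
\vec{r}\,'(t) = \frac{d}{dt}\big(x(t)\,\vec{\imath}\big) + \frac{d}{dt}\big(y(t)\,\vec{\jmath}\big) + \frac{d}{dt}\big(z(t)\,\vec{k}\big) = x'(t)\,\vec{\imath} + y'(t)\,\vec{\jmath} + z'(t)\,\vec{k},
\]
because $\frac{d}{dt}\vec{\imath} = \vec{0}$, $\frac{d}{dt}\vec{\jmath} = \vec{0}$, and $\frac{d}{dt}\vec{k} = \vec{0}$. This last remark shows that the frame
$(\vec{\imath}, \vec{\jmath}, \vec{k})$ associated to the Cartesian coordinate system is a constant frame.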
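As we discuss variable frames, we introduce a nice way to describe the rate
of change of a variable frame. Suppose that $\{\vec{u}_1, \vec{u}_2, \vec{u}_3\}$ is a basis of $\mathbb{R}^3$ and let
$\vec{a}$ and $\vec{b}$ be two other vectors with components $\vec{a} = a_1\vec{u}_1 + a_2\vec{u}_2 + a_3\vec{u}_3$ and $\vec{b} =
b_1\vec{u}_1 + b_2\vec{u}_2 + b_3\vec{u}_3$. Assuming that all vectors are column vectors, we can write
these component definitions of $\vec{a}$ and $\vec{b}$ in the matrix expression
\[
\begin{pmatrix} \vec{a} & \vec{b} \end{pmatrix} = \begin{pmatrix} \vec{u}_1 & \vec{u}_2 & \vec{u}_3 \end{pmatrix} \begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{pmatrix}.
\]
Using this notation, we can express the relationships $\frac{d}{dt}\vec{\imath} = \vec{0}$, $\frac{d}{dt}\vec{\jmath} = \vec{0}$, and $\frac{d}{dt}\vec{k} = \vec{0}$
by
\[
\frac{d}{dt}\begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \end{pmatrix} = \begin{pmatrix} \vec{\imath} & \vec{\jmath} & \vec{k} \end{pmatrix} \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{2.1}
\]
This notation appears trivial but it will become important as we study the behavior
of frames associated to other natural coordinate systems.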
Using cylindrical coordinates, we locate a point in $\mathbb{R}^3$ using the distance $r$
between the origin and the projection of the point onto the $xy$-plane, the angle $\theta$ from
the positive $x$-axis, and the height $z$ above the $xy$-plane. See Figure 2.1. We
have the following relationship between Cartesian coordinates and cylindrical
coordinates:
\[
\begin{cases} x = r\cos\theta, \\ y = r\sin\theta, \\ z = z, \end{cases}
\Longleftrightarrow
\begin{cases} r = \sqrt{x^2 + y^2}, \\ \theta = \tan^{-1}\left(\frac{y}{x}\right), \\ z = z. \end{cases} \tag{2.2}
\]
Of course, by the expression $\tan^{-1}\left(\frac{y}{x}\right)$, one must understand that we assume that
$x > 0$. For $x \leq 0$, one must adjust the formula to obtain the appropriate
corresponding angle. Using cylindrical coordinates, one would locate a point in space
by
\[
\vec{r} = (r\cos\theta, r\sin\theta, z).
\]
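
In computations, the adjustment of the angle outside the region $x > 0$ is usually delegated to a two-argument arctangent. The sketch below uses `math.atan2` from the Python standard library; the normalization of $\theta$ to $[0, 2\pi)$ and the function names are choices made here for illustration, not conventions fixed by the text.

```python
import math


def cartesian_to_cylindrical(x, y, z):
    """Return (r, theta, z) with theta normalized to [0, 2*pi)."""
    r = math.hypot(x, y)
    # atan2 picks the correct quadrant, including the cases x <= 0
    # where tan^{-1}(y/x) alone would give the wrong angle or be undefined.
    theta = math.atan2(y, x) % (2 * math.pi)
    return (r, theta, z)


def cylindrical_to_cartesian(r, theta, z):
    return (r * math.cos(theta), r * math.sin(theta), z)


print(cartesian_to_cylindrical(-1.0, -1.0, 3.0))   # theta lies in the third quadrant
print(cylindrical_to_cartesian(*cartesian_to_cylindrical(-1.0, -1.0, 3.0)))
```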


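We define the natural frame with respect to this coordinate system as follows.
To each independent variable in the coordinate system, one associates the unit
vector that corresponds to the direction of change with respect to that variable.
For example, with cylindrical coordinates, we have the following three unit vectors:
\[
\vec{e}_r = \frac{\partial \vec{r}}{\partial r} \Big/ \left\| \frac{\partial \vec{r}}{\partial r} \right\|, \qquad
\vec{e}_\theta = \frac{\partial \vec{r}}{\partial \theta} \Big/ \left\| \frac{\partial \vec{r}}{\partial \theta} \right\|, \qquad
\vec{e}_z = \frac{\partial \vec{r}}{\partial z} \Big/ \left\| \frac{\partial \vec{r}}{\partial z} \right\|. \tag{2.3}
\]
These formulas give us explicitly
\[
\begin{aligned}
\vec{e}_r &= (\cos\theta, \sin\theta, 0) = \cos\theta\,\vec{\imath} + \sin\theta\,\vec{\jmath}, \\
\vec{e}_\theta &= (-\sin\theta, \cos\theta, 0) = -\sin\theta\,\vec{\imath} + \cos\theta\,\vec{\jmath}, \\
\vec{e}_z &= (0,0,1) = \vec{k}.
\end{aligned} \tag{2.4}
\]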




Using this new frame, the position vector of a point with cylindrical coordinates
$(r, \theta, z)$ is
\[
\vec{r} = r\,\vec{e}_r + z\,\vec{e}_z. \tag{2.5}
\]
As opposed to the fixed frame $(\vec{\imath}, \vec{\jmath}, \vec{k})$, frames associated to non-Cartesian
coordinates often depend on the position of the base point $p$ of the frame. Here,
the frame $(\vec{e}_r, \vec{e}_\theta, \vec{e}_z)$ associated to cylindrical coordinates depends explicitly
on the coordinates $(r, \theta, z)$ (in this case, only on $\theta$) of the frame’s origin point.

To see how this frame varies with respect to any parameter, consider a space
curve parametrized by $\vec{r} : I \to \mathbb{R}^3$, where $I$ is an interval of $\mathbb{R}$. We can attach
the frame $(\vec{e}_r, \vec{e}_\theta, \vec{e}_z)$ to each point $\vec{r}(t)$ of the curve, but, unlike with the fixed
Cartesian frame, the frame $(\vec{e}_r, \vec{e}_\theta, \vec{e}_z)$ is not constant. As we study motion in the
new coordinate system, we are led to take higher derivatives of $\vec{r}(t)$ and express
them with components in the frame associated to the particular coordinate system.

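If $\vec{r}(t)$ is a space curve, then $r$, $\theta$, and $z$ are functions of $t$. Therefore, taking the
derivative with respect to $t$, we get
\[
\vec{r}\,' = \frac{d}{dt}(r\,\vec{e}_r) + \frac{d}{dt}(z\,\vec{e}_z) = r'\vec{e}_r + r\frac{d}{dt}\vec{e}_r + z'\vec{e}_z + z\frac{d}{dt}\vec{e}_z.
\]
Thus, in order to write equations of motion in cylindrical coordinates, we must
determine $\frac{d}{dt}\vec{e}_r$, $\frac{d}{dt}\vec{e}_\theta$, and $\frac{d}{dt}\vec{e}_z$. We obtain
\[
\begin{aligned}
\frac{d}{dt}\vec{e}_r &= \frac{d}{dt}(\cos\theta, \sin\theta, 0) = (-\theta'\sin\theta, \theta'\cos\theta, 0) = \theta'\vec{e}_\theta, \\
\frac{d}{dt}\vec{e}_\theta &= \frac{d}{dt}(-\sin\theta, \cos\theta, 0) = (-\theta'\cos\theta, -\theta'\sin\theta, 0) = -\theta'\vec{e}_r, \\
\frac{d}{dt}\vec{e}_z &= \frac{d}{dt}(0,0,1) = \vec{0}.
\end{aligned}
\]
Following the same method of presentation as in (2.1), the change of the
cylindrical coordinates frame can be expressed as
\[
\frac{d}{dt}\begin{pmatrix} \vec{e}_r & \vec{e}_\theta & \vec{e}_z \end{pmatrix} = \begin{pmatrix} \vec{e}_r & \vec{e}_\theta & \vec{e}_z \end{pmatrix} \begin{pmatrix} 0 & -\theta' & 0 \\ \theta' & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{2.6}
\]
An application of the cylindrical frame and its rate of change arises when
describing the velocity vector and acceleration vector:
\[
\begin{aligned}
\vec{r}\,' &= r'\vec{e}_r + r\theta'\vec{e}_\theta + z'\vec{e}_z, \\
\vec{r}\,'' &= r''\vec{e}_r + r'\theta'\vec{e}_\theta + r'\theta'\vec{e}_\theta + r\theta''\vec{e}_\theta + r(\theta')^2(-\vec{e}_r) + z''\vec{e}_z \\
&= \big(r'' - r(\theta')^2\big)\vec{e}_r + \big(2r'\theta' + r\theta''\big)\vec{e}_\theta + z''\vec{e}_z.
\end{aligned}
\]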


Figure 2.1: Cylindrical coordinates. Figure 2.2: Spherical coordinates.

If we restrict ourselves to polar coordinates, the above formula would still hold but
with no $z$-component. In the study of trajectories in the plane, the first four terms
in the last expression have particular names (see [22, Section 5.2]). We call

$r''\vec{e}_r$ the radial acceleration,
$-r(\theta')^2\vec{e}_r$ the centripetal acceleration,
$2r'\theta'\vec{e}_\theta$ the Coriolis acceleration, and
$r\theta''\vec{e}_\theta$ the component due to angular acceleration.
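
The decomposition of the acceleration in the polar frame can be compared numerically with a direct second derivative of the Cartesian position. The following sketch assumes a hypothetical trajectory $r(t) = 1 + t^2$, $\theta(t) = t^2$ (chosen only for illustration) and uses a second central difference with an illustrative step size.

```python
import math


def r_of(t):
    return 1.0 + t * t      # hypothetical r(t)


def theta_of(t):
    return t * t            # hypothetical theta(t)


def position(t):
    """Cartesian position (x, y) of the point with polar coordinates (r(t), theta(t))."""
    r, th = r_of(t), theta_of(t)
    return (r * math.cos(th), r * math.sin(th))


def acceleration_by_differences(t, h=1e-4):
    """Second central difference of the Cartesian position."""
    p_minus, p0, p_plus = position(t - h), position(t), position(t + h)
    return tuple((a - 2 * b + c) / (h * h) for a, b, c in zip(p_plus, p0, p_minus))


def acceleration_from_frame(t):
    """(r'' - r theta'^2) e_r + (2 r' theta' + r theta'') e_theta, with exact derivatives."""
    r, dr, ddr = r_of(t), 2.0 * t, 2.0
    th, dth, ddth = theta_of(t), 2.0 * t, 2.0
    radial = ddr - r * dth ** 2               # radial plus centripetal terms
    transverse = 2.0 * dr * dth + r * ddth    # Coriolis plus angular-acceleration terms
    e_r = (math.cos(th), math.sin(th))
    e_theta = (-math.sin(th), math.cos(th))
    return tuple(radial * a + transverse * b for a, b in zip(e_r, e_theta))


print(acceleration_by_differences(0.7))
print(acceleration_from_frame(0.7))
```

The two printed vectors should agree up to the discretization error of the finite difference.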


Example 2.1.1. Using spherical coordinates, we locate a point $P$ in $\mathbb{R}^3$ as follows.
Let $P'$ be the projection of $P$ onto the $xy$-plane. Use the distance from the origin
$\rho = OP$, longitude $\theta$ (i.e., the angle from the positive $x$-axis to the ray $[OP')$),
and the angle $\varphi$, which is the angle between the positive $z$-axis and the ray $[OP)$.
See Figure 2.2. Elementary geometry gives the relationship between Cartesian
coordinates and spherical coordinates:
\[
\begin{cases} x = \rho\cos\theta\sin\varphi, \\ y = \rho\sin\theta\sin\varphi, \\ z = \rho\cos\varphi, \end{cases}
\Longleftrightarrow
\begin{cases} \rho = \sqrt{x^2 + y^2 + z^2}, \\ \theta = \tan^{-1}\left(\frac{y}{x}\right), \\ \varphi = \cos^{-1}\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right), \end{cases} \tag{2.7}
\]
with the same caveat for $\theta$ as discussed with cylindrical coordinates. With spherical
coordinates, we usually assume that $\rho \geq 0$, $0 \leq \theta < 2\pi$, and $0 \leq \varphi \leq \pi$.


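We leave it as an exercise for the reader to determine the frame associated to a
spherical coordinate system as
\[
\begin{aligned}
\vec{e}_\rho &= \frac{\partial \vec{r}}{\partial \rho} \Big/ \left\| \frac{\partial \vec{r}}{\partial \rho} \right\| = (\cos\theta\sin\varphi, \sin\theta\sin\varphi, \cos\varphi), \\
\vec{e}_\theta &= \frac{\partial \vec{r}}{\partial \theta} \Big/ \left\| \frac{\partial \vec{r}}{\partial \theta} \right\| = (-\sin\theta, \cos\theta, 0), \\
\vec{e}_\varphi &= \frac{\partial \vec{r}}{\partial \varphi} \Big/ \left\| \frac{\partial \vec{r}}{\partial \varphi} \right\| = (\cos\theta\cos\varphi, \sin\theta\cos\varphi, -\sin\varphi).
\end{aligned}
\]
In contrast to Cartesian coordinates, where the position vector of a point is $\vec{r} =
x\vec{\imath} + y\vec{\jmath} + z\vec{k}$, and in contrast to cylindrical coordinates where the position vector is
given by (2.5), in spherical coordinates, the position vector is simply
\[
\vec{r} = \rho\,\vec{e}_\rho.
\]
To discuss how the frame associated to spherical coordinates changes, consider
a parametric curve $\vec{r} : I \to \mathbb{R}^3$ and calculate how $\vec{e}_\rho$, $\vec{e}_\theta$, and $\vec{e}_\varphi$ change as $t$ changes.
Again, we leave it as an exercise for the reader to show that
\[
\frac{d}{dt}\begin{pmatrix} \vec{e}_\rho & \vec{e}_\theta & \vec{e}_\varphi \end{pmatrix} = \begin{pmatrix} \vec{e}_\rho & \vec{e}_\theta & \vec{e}_\varphi \end{pmatrix} \begin{pmatrix} 0 & -\theta'\sin\varphi & -\varphi' \\ \theta'\sin\varphi & 0 & \theta'\cos\varphi \\ \varphi' & -\theta'\cos\varphi & 0 \end{pmatrix}. \tag{2.8}
\]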

All the coordinate systems we have considered thus far, though curvilinear, 
are examples of orthogonal coordinate systems; the basis vectors associated to the 
coordinate system form an orthogonal basis of R”. In general, this is not the case. 
We point out that, as shown in (2.4), both mathematicians and physicists make the 
traditional choice when they impose that the frames associated to the cylindrical 
and spherical coordinate systems be composed of unit vectors. As useful as this is 
for calculations involving distances or angles, this choice has some drawbacks. 

We now consider general coordinate systems in $\mathbb{R}^n$. Already in polar,
cylindrical, and spherical coordinates, we encounter some challenges in bringing together
practical application and precision. For example, polar coordinates $(r, \theta)$ do locate
points uniquely in the plane and for every point $p$ in the plane, there do exist some
$(r_0, \theta_0)$ that correspond to $p$. However, the assignment $p = f(r, \theta)$ is not injective.

Let $S$ be an open set in $\mathbb{R}^n$. A continuous surjective function $f : U \to S$, where
$U$ is an open set in $\mathbb{R}^n$, defines a coordinate system on $S$ by associating to every
point $P \in S$ an $n$-tuple $x(P) = (x^1(P), x^2(P), \ldots, x^n(P))$ such that $f(x(P)) = P$.
In this notation, the superscripts do not indicate powers of a variable $x$ but the $i$th
coordinate for that point in the given coordinate system. Though a possible source
of confusion at first, differential geometry literature uses superscripts instead of the
usual subscripts in order to mesh properly with subsequent tensor notation. As
with polar coordinates where $(r_0, \theta_0)$ and $(r_0, \theta_0 + 2\pi)$ correspond to the same point
in the plane, in practice the $n$-tuple need not be uniquely associated to the point
$P$. However, it is not uncommon for the sake of proofs to restrict $f$ to a smaller
domain $V \subset U$ so that $f|_V$ is a bijection with the corresponding $x : f(V) \to V$ as
the inverse. Note that in this latter case, we call $x = (x^1, x^2, \ldots, x^n)$ the coordinate
functions, or the coordinate system.

Let $(x^1, x^2, \ldots, x^n)$ be a coordinate system in $\mathbb{R}^n$. Since $\mathbb{R}^n$ is a vector space,
we can talk about position vectors of points in $\mathbb{R}^n$. To say that the $n$-tuple
$(x^1, x^2, \ldots, x^n)$ gives coordinates of a point $p$ means that $p$ has a position vector
$\vec{r}$ that is a function in the $n$ variables $(x^1, x^2, \ldots, x^n)$. In our present formulation,
the position function $\vec{r}(x^1, x^2, \ldots, x^n)$ is precisely the function $f$.


Definition 2.1.2. Let $x : S \to U$ be a coordinate system on an open subset
$S \subset \mathbb{R}^n$. If $p$ is not a critical point of $x$, then the frame (or basis) associated to this
coordinate system at $p$ is the set of vectors
\[
\left\{ \frac{\partial \vec{r}}{\partial x^1}\Big|_p, \ \frac{\partial \vec{r}}{\partial x^2}\Big|_p, \ \ldots, \ \frac{\partial \vec{r}}{\partial x^n}\Big|_p \right\}. \tag{2.9}
\]
If there is no cause for confusion, we drop the $|_p$ but understand from context that
derivatives are evaluated at a point $p$. We say that the components of a vector $\vec{A}$
at $p$ in this system of coordinates are $(A^1, A^2, \ldots, A^n)$ if we can write
\[
\vec{A} = \sum_{i=1}^n A^i \frac{\partial \vec{r}}{\partial x^i}. \tag{2.10}
\]
Note that since $p$ is not a critical point of $x$, then $dx_p$ is invertible with inverse
$(dx_p)^{-1} = df_{x(p)}$. Hence, the columns of $[df_{x(p)}]$, which are precisely these vectors
$\partial \vec{r}/\partial x^i|_{x(p)}$, form a linearly independent set. In general, this condition of linear
independence is all we can assume from a frame associated to a general coordinate
system at $p$, namely, it need not be an orthogonal set of vectors or consist of unit
vectors. If the set of vectors (2.9) is an orthogonal set, then the system of coordinates
is called an orthogonal coordinate system.


Definition 2.1.3. Let $x : S \to U$ be an orthogonal coordinate system on an open
subset $S \subset \mathbb{R}^n$. The scale factors of this coordinate system at a point $p$ that is not a
critical point are $h_{x^1}, h_{x^2}, \ldots, h_{x^n}$, where
\[
h_{x^i} = \left\| \frac{\partial \vec{r}}{\partial x^i} \right\|.
\]
When a coordinate system $(x^1, x^2, \ldots, x^n)$ is orthogonal, it is common to
divide the basis vectors $\partial \vec{r}/\partial x^i$ by the scale factors to obtain an orthonormal basis
associated to the coordinate system. This is precisely what we did with both the
cylindrical and spherical coordinate systems.

Another interesting aspect of using frames associated to coordinate systems
involves how to consider rates of change of a vector field when expressed with respect
to a variable frame. Let $U$ be an open subset of $\mathbb{R}^n$. Let $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$ be a
variable frame defined over $U$, i.e., each vector $\vec{u}_i$ is a vector function $\vec{u}_i(x^1, x^2, \ldots, x^n)$
that is differentiable on $U$ and for each $(x^1, x^2, \ldots, x^n)$ the collection of vectors is
linearly independent. Let $\vec{V}$ be a vector field defined on $U$. At each point $p \in U$,
the vector $\vec{V}(p)$ can be decomposed into components $V^i(p)$ as
\[
\vec{V}(p) = V^1(p)\vec{u}_1(p) + V^2(p)\vec{u}_2(p) + \cdots + V^n(p)\vec{u}_n(p).
\]
More concisely, $\vec{V} = V^1\vec{u}_1 + V^2\vec{u}_2 + \cdots + V^n\vec{u}_n$, where we understand each $V^j$ to be
a function on $U$. (Again, the superscripts are indices and not powers. We explain
this convention in more detail when we discuss multilinear algebra in Chapter 4.)
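When we take partial derivatives of the vector field $\vec{V}$, we can express these
derivatives in terms of the local frame $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$. We have
\[
\frac{\partial \vec{V}}{\partial x^i} = \frac{\partial}{\partial x^i}\left( \sum_{j=1}^n V^j\vec{u}_j \right) = \sum_{j=1}^n \frac{\partial V^j}{\partial x^i}\vec{u}_j + \sum_{j=1}^n V^j \frac{\partial \vec{u}_j}{\partial x^i}.
\]
In order to proceed and find the component functions of $\partial \vec{V}/\partial x^i$, we need to
decompose $\partial \vec{u}_j/\partial x^i$ into its components with respect to $\{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n\}$. This leads
to the collection of $n^3$ functions $\Gamma^k_{ij}$ defined as
\[
\frac{\partial \vec{u}_j}{\partial x^i} = \sum_{k=1}^n \Gamma^k_{ij}\vec{u}_k.
\]
Then
\[
\begin{aligned}
\frac{\partial \vec{V}}{\partial x^i} &= \sum_{j=1}^n \frac{\partial V^j}{\partial x^i}\vec{u}_j + \sum_{j=1}^n V^j \left( \sum_{k=1}^n \Gamma^k_{ij}\vec{u}_k \right) \\
&= \sum_{k=1}^n \frac{\partial V^k}{\partial x^i}\vec{u}_k + \sum_{k=1}^n \left( \sum_{j=1}^n \Gamma^k_{ij}V^j \right)\vec{u}_k
= \sum_{k=1}^n \left( \frac{\partial V^k}{\partial x^i} + \sum_{j=1}^n \Gamma^k_{ij}V^j \right)\vec{u}_k.
\end{aligned}
\]
Hence, because we work in variable frames, the $k$th component of the vector field
$\partial \vec{V}/\partial x^i$ is not just $\partial V^k/\partial x^i$, but rather
\[
\left( \frac{\partial \vec{V}}{\partial x^i} \right)^k = \frac{\partial V^k}{\partial x^i} + \sum_{j=1}^n \Gamma^k_{ij}V^j. \tag{2.11}
\]
Equation (2.11) will reappear in a more general context in the analysis on
manifolds. In that context, the collection of functions $\Gamma^k_{ij}$ are called the components of
a connection.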


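Example 2.1.4 (Spherical Coordinates, 1). We illustrate how to calculate the
$\Gamma^k_{ij}$ functions for the normalized spherical coordinate frame. Consider the variable
frame $\{\vec{e}_\rho, \vec{e}_\theta, \vec{e}_\varphi\}$ and use (2.8) where instead of $t$, we use $\rho$, $\theta$, and $\varphi$ successively
for the derivatives, i.e., for $i = 1, 2, 3$. Thus, with $k$ representing the row and $j$
representing the column, we have
\[
\Gamma^k_{1j} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\Gamma^k_{2j} = \begin{pmatrix} 0 & -\sin\varphi & 0 \\ \sin\varphi & 0 & \cos\varphi \\ 0 & -\cos\varphi & 0 \end{pmatrix}, \quad
\Gamma^k_{3j} = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}.
\]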
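
The matrices of Example 2.1.4 can be approximated numerically by differentiating the normalized frame with finite differences and projecting onto the frame itself. This is only a sketch: the function names are illustrative, indices run over 0, 1, 2 for $(\rho, \theta, \varphi)$, and the projection by dot products relies on the frame being orthonormal.

```python
import math


def frame(rho, theta, phi):
    """Orthonormal spherical frame (e_rho, e_theta, e_phi) at the given coordinates."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e_rho = (ct * sp, st * sp, cp)
    e_theta = (-st, ct, 0.0)
    e_phi = (ct * cp, st * cp, -sp)
    return (e_rho, e_theta, e_phi)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def connection(i, point, h=1e-6):
    """Matrix [Gamma^k_{ij}] (row k, column j) for the derivative in coordinate i."""
    plus, minus = list(point), list(point)
    plus[i] += h
    minus[i] -= h
    f_plus, f_minus, f0 = frame(*plus), frame(*minus), frame(*point)
    gamma = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        # central difference of the j-th frame vector with respect to coordinate i
        d_ej = tuple((a - b) / (2 * h) for a, b in zip(f_plus[j], f_minus[j]))
        for k in range(3):
            # orthonormal frame: the k-th component is a dot product
            gamma[k][j] = dot(d_ej, f0[k])
    return gamma


point = (2.0, 0.8, 1.1)   # arbitrary sample values of (rho, theta, phi)
for row in connection(1, point):
    print([round(entry, 6) for entry in row])
```

For the derivative with respect to $\theta$ (index 1), the printed rows should approximate the matrix with entries $\pm\sin\varphi$ and $\pm\cos\varphi$ given in the example.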


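Example 2.1.5 (Spherical Coordinates, 2). The previous example used the
normalized frame for spherical coordinates. We could also use the basis described by
(2.9), which consists of the three vectors
\[
\begin{aligned}
\vec{u}_1 &= \frac{\partial \vec{r}}{\partial \rho} = (\cos\theta\sin\varphi, \ \sin\theta\sin\varphi, \ \cos\varphi), \\
\vec{u}_2 &= \frac{\partial \vec{r}}{\partial \theta} = (-\rho\sin\theta\sin\varphi, \ \rho\cos\theta\sin\varphi, \ 0), \\
\vec{u}_3 &= \frac{\partial \vec{r}}{\partial \varphi} = (\rho\cos\theta\cos\varphi, \ \rho\sin\theta\cos\varphi, \ -\rho\sin\varphi).
\end{aligned}
\]
Calculating the $\Gamma^k_{ij}$ components requires us to take derivatives of each of the above
vector functions with respect to each of the coordinates and then decompose back
into the basis $\{\vec{u}_1, \vec{u}_2, \vec{u}_3\}$. Because these three vectors are orthogonal, though not
unit vectors, we find the components of a vector $\vec{v}$ in this frame by
\[
\vec{v} = \frac{\vec{v}\cdot\vec{u}_1}{\vec{u}_1\cdot\vec{u}_1}\vec{u}_1 + \frac{\vec{v}\cdot\vec{u}_2}{\vec{u}_2\cdot\vec{u}_2}\vec{u}_2 + \frac{\vec{v}\cdot\vec{u}_3}{\vec{u}_3\cdot\vec{u}_3}\vec{u}_3.
\]
The calculations are straightforward and we leave it as an exercise to prove that
\[
\Gamma^k_{1j} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac{1}{\rho} & 0 \\ 0 & 0 & \frac{1}{\rho} \end{pmatrix}, \quad
\Gamma^k_{2j} = \begin{pmatrix} 0 & -\rho\sin^2\varphi & 0 \\ \frac{1}{\rho} & 0 & \cot\varphi \\ 0 & -\sin\varphi\cos\varphi & 0 \end{pmatrix}, \quad \text{and} \quad
\Gamma^k_{3j} = \begin{pmatrix} 0 & 0 & -\rho \\ 0 & \cot\varphi & 0 \\ \frac{1}{\rho} & 0 & 0 \end{pmatrix}.
\]
We have chosen to list the functions with fixed $i$ since this is the variable with
respect to which we take the derivative. However, it is interesting to organize the
functions into three matrices, each corresponding to a fixed $k$. We get
\[
\Gamma^1_{ij} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -\rho\sin^2\varphi & 0 \\ 0 & 0 & -\rho \end{pmatrix}, \quad
\Gamma^2_{ij} = \begin{pmatrix} 0 & \frac{1}{\rho} & 0 \\ \frac{1}{\rho} & 0 & \cot\varphi \\ 0 & \cot\varphi & 0 \end{pmatrix}, \quad
\Gamma^3_{ij} = \begin{pmatrix} 0 & 0 & \frac{1}{\rho} \\ 0 & -\sin\varphi\cos\varphi & 0 \\ \frac{1}{\rho} & 0 & 0 \end{pmatrix},
\]
each of which is a symmetric matrix.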
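
The symmetry in the lower indices can be observed numerically for the unnormalized frame. The sketch below approximates $\Gamma^k_{ij}$ by finite differences and uses the orthogonal-projection formula for the components; the function names, the zero-based indexing over $(\rho, \theta, \varphi)$, and the sample point are assumptions made for illustration.

```python
import math


def coordinate_frame(rho, theta, phi):
    """Unnormalized frame (dr/drho, dr/dtheta, dr/dphi) of spherical coordinates."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    u_rho = (ct * sp, st * sp, cp)
    u_theta = (-rho * st * sp, rho * ct * sp, 0.0)
    u_phi = (rho * ct * cp, rho * st * cp, -rho * sp)
    return (u_rho, u_theta, u_phi)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gamma(k, i, j, point, h=1e-6):
    """Approximate Gamma^k_{ij}: the u_k-component of d(u_j)/d(x^i)."""
    plus, minus = list(point), list(point)
    plus[i] += h
    minus[i] -= h
    u_plus, u_minus = coordinate_frame(*plus)[j], coordinate_frame(*minus)[j]
    d_uj = [(a - b) / (2 * h) for a, b in zip(u_plus, u_minus)]
    u_k = coordinate_frame(*point)[k]
    # orthogonal (not orthonormal) frame: divide by the squared length of u_k
    return dot(d_uj, u_k) / dot(u_k, u_k)


point = (2.0, 0.8, 1.1)   # arbitrary sample values of (rho, theta, phi)
largest_asymmetry = max(abs(gamma(k, i, j, point) - gamma(k, j, i, point))
                        for k in range(3) for i in range(3) for j in range(3))
print(largest_asymmetry)  # expected to be at the level of the finite-difference error
```

The symmetry reflects the equality of mixed partial derivatives of the position function, since here $\vec{u}_j = \partial \vec{r}/\partial x^j$.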


PROBLEMS 
2.1.1. Prove Equation (2.8) for the rate of change of the spherical coordinates frame. 
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Figure 2.3: Coordinate planes for parabolic coordinates. 


2.1.2. Let $\vec{r} : I \to \mathbb{R}^3$ be a smooth curve in space. Express $\vec{r}\,'$ and $\vec{r}\,''$ in terms of
functions of spherical coordinates $\rho(t)$, $\theta(t)$, $\varphi(t)$, and the local frame $\{\vec{e}_\rho, \vec{e}_\theta, \vec{e}_\varphi\}$.


2.1.3. Calculate the $\Gamma^k_{ij}$ functions for the spherical coordinate frame as described in
Example 2.1.5.


2.1.4. Fix a positive real number $a$. Elliptic coordinates on $\mathbb{R}^2$ consist of the pair $(\mu, \nu)$
with $\mu \geq 0$ and $0 \leq \nu < 2\pi$, connected to Cartesian coordinates by
\[
\begin{aligned} x &= a\cosh\mu\cos\nu, \\ y &= a\sinh\mu\sin\nu. \end{aligned}
\]

(a) Prove that the curves of constant $\mu$ form ellipses and that curves of constant
$\nu$ form hyperbolas.

(b) Calculate $\partial \vec{r}/\partial \mu$ and $\partial \vec{r}/\partial \nu$ and observe that the elliptic coordinate system
is an orthogonal system.

(c) Show that the scale factors are $h_\mu = h_\nu = a\sqrt{\cosh^2\mu - \cos^2\nu}$ and calculate
$\vec{e}_\mu$ and $\vec{e}_\nu$.

(d) Calculate the eight connection functions $\Gamma^k_{ij}$ for $i, j, k = 1, 2$ associated to
the frame $\{\vec{e}_\mu, \vec{e}_\nu\}$.




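2.1.5. The parabolic coordinate system of $\mathbb{R}^3$ consists of the triple $(u, v, \theta)$, with $u \in
[0, +\infty)$, $v \in [0, +\infty)$, and $\theta \in [0, 2\pi)$ with equations
\[
\begin{aligned} x &= uv\cos\theta, \\ y &= uv\sin\theta, \\ z &= \tfrac{1}{2}(u^2 - v^2). \end{aligned}
\]
These equations are also called the transition functions from parabolic to
Cartesian coordinates. (Figure 2.3 shows the three coordinate “planes” for parabolic
coordinates in $\mathbb{R}^3$ passing through the point $P \in \mathbb{R}^3$ with coordinates $(u, v, \theta) =
(1, 1/2, 1/4)$.)

(a) Find the basis vectors for the associated frame according to (2.9) and show
that the parabolic coordinate system is an orthogonal coordinate system.

(b) Consider also the basis $\{\vec{e}_u, \vec{e}_v, \vec{e}_\theta\}$ given by
\[
\vec{e}_u = \frac{\partial \vec{r}}{\partial u} \Big/ \left\| \frac{\partial \vec{r}}{\partial u} \right\|, \qquad
\vec{e}_v = \frac{\partial \vec{r}}{\partial v} \Big/ \left\| \frac{\partial \vec{r}}{\partial v} \right\|, \qquad
\vec{e}_\theta = \frac{\partial \vec{r}}{\partial \theta} \Big/ \left\| \frac{\partial \vec{r}}{\partial \theta} \right\|.
\]
Calculate the rate of change matrix for this frame similar to (2.8) as done
for spherical coordinates.

(c) Calculate the $\Gamma^k_{ij}$ connection functions associated to the $\{\vec{e}_u, \vec{e}_v, \vec{e}_\theta\}$ frame
of parabolic coordinates.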


2.1.6. Toroidal coordinates in $\mathbb{R}^3$ are denoted by the triple $(\sigma, \tau, \phi)$ and transform into
Cartesian coordinates via

\[
x = \frac{\sinh\tau}{\cosh\tau - \cos\sigma}\cos\phi, \qquad
y = \frac{\sinh\tau}{\cosh\tau - \cos\sigma}\sin\phi, \qquad
z = \frac{\sin\sigma}{\cosh\tau - \cos\sigma},
\]

typically used with $-\pi < \sigma \leq \pi$, $0 \leq \tau$, and $0 \leq \phi < 2\pi$.

(a) Show that surfaces of constant $\sigma$ are spheres of center $(0, 0, \cot\sigma)$ and radius
$\csc\sigma$; that surfaces of constant $\tau$ are tori with the $z$-axis as the axis of
rotation; and that surfaces of constant $\phi$ are planes through the $z$-axis.

(b) Find the frame associated to this coordinate system and show that this
coordinate system is an orthogonal system.

(c) Show that the scale factors are $h_\sigma = h_\tau = 1/(\cosh\tau - \cos\sigma)$ and
$h_\phi = \sinh\tau/(\cosh\tau - \cos\sigma)$, and calculate the associated orthonormal frame $\{\vec{e}_\sigma, \vec{e}_\tau, \vec{e}_\phi\}$.

(d) Calculate the rate of change matrix for $\{\vec{e}_\sigma, \vec{e}_\tau, \vec{e}_\phi\}$ similar to (2.8) as done
for spherical coordinates.
2.1.7. Consider the coordinate system on $\mathbb{R}^2$ that employs the pair $(s, \alpha) \in [0, +\infty) \times [0, 2\pi)$
to represent the point on the ellipse
2 
x Oo 
7a ame 


that lies on the ray from the origin and through (cos a, sin a). 


2 
(a) Determine change of coordinate system equations from and to Cartesian
coordinates. 
(b) Find the set B of basis vectors for the associated frame according to Equation 
(2.9). 
Or Or 3. : : 
(c) Prove that — - —— = =ssin(2qa) and conclude that this coordinate system 


Os Oa 8 


is not orthogonal. 


(d) Calculate the rate of change matrix associated to this frame B. 


2.2 Frames Associated to Trajectories 


In the study of trajectories, whether in physics or geometry, it is often convenient 
to use a frame that is different from the Cartesian frame. Changing types of frames 
sometimes makes difficult integrals tractable or makes certain difficult differential 
equations manageable. In the particular context of special relativity, one talks about 
a momentarily comoving reference frame, abbreviated to MCRF.[50] 

In the study of plane curves, it is common to use the frame $\{\vec{T}, \vec{U}\}$ to study
the local properties of a plane curve $\vec{x}(t)$. (See [5, Chapter 1].) The vector $\vec{T}(t)$
is the unit tangent vector $\vec{T}(t) = \vec{x}'(t)/\|\vec{x}'(t)\|$, and the unit normal vector $\vec{U}(t)$
is the result of rotating $\vec{T}(t)$ by $\pi/2$ in the counterclockwise direction. This is a
moving frame that is defined in terms of a given regular curve $\vec{x}(t)$ and, at $t = t_0$,
is viewed as based at the point $\vec{x}(t_0)$. To compare with applications in physics, it
is important to note that the $\{\vec{T}, \vec{U}\}$ frame is not the same as the polar coordinate
frame $\{\vec{e}_r, \vec{e}_\theta\}$. From Equation (2.4) (and ignoring the $z$-component), we know that

\[
\vec{e}_r = (\cos\theta, \sin\theta) \quad\text{and}\quad \vec{e}_\theta = (-\sin\theta, \cos\theta).
\]

Assuming that $x$, $y$, $r$, and $\theta$ are functions of $t$ and since $x = r\cos\theta$ and $y = r\sin\theta$,
we have

\[
\vec{x}'(t) = (x'(t), y'(t)) = (r'\cos\theta - r\theta'\sin\theta,\; r'\sin\theta + r\theta'\cos\theta) = r'\vec{e}_r + r\theta'\vec{e}_\theta.
\]

We then calculate the speed function to be

\[
s'(t) = \|\vec{x}'(t)\| = \sqrt{(r')^2 + r^2(\theta')^2}
\]

and find the unit tangent and unit normal vectors to be

\[
\vec{T} = \frac{1}{\sqrt{(r')^2 + r^2(\theta')^2}}\,(r'\vec{e}_r + r\theta'\vec{e}_\theta),
\qquad
\vec{U} = \frac{1}{\sqrt{(r')^2 + r^2(\theta')^2}}\,(-r\theta'\vec{e}_r + r'\vec{e}_\theta).
\]
Therefore, the orthogonal matrix

\[
\frac{1}{\sqrt{(r')^2 + r^2(\theta')^2}}
\begin{pmatrix} r' & -r\theta' \\ r\theta' & r' \end{pmatrix}
\]

is the transition matrix between the $\{\vec{T}, \vec{U}\}$ basis and the $\{\vec{e}_r, \vec{e}_\theta\}$ basis.
Parenthetically, it is now not difficult to obtain a formula for the plane curvature
of $\vec{x}(t)$ in terms of the functions $r(t)$ and $\theta(t)$. We use either of the formulations


and we find that 


\[
\kappa_g(t) = \frac{-r r''\theta' + r^2(\theta')^3 + 2(r')^2\theta' + r r'\theta''}{\big((r')^2 + r^2(\theta')^2\big)^{3/2}}. \tag{2.12}
\]
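As a quick sanity check of Equation (2.12), the sketch below evaluates the formula on two curves whose curvature is known independently: a circle of radius 3 (curvature 1/3, whatever the parametrization speed) and the Archimedean spiral $r = \theta$ at $\theta = 1$ (curvature $3/2^{3/2}$). The function name `polar_curvature` and its argument names are illustrative, not notation from the text.

```python
def polar_curvature(r, dr, ddr, dth, ddth):
    """Plane curvature from Equation (2.12), given r, r', r'', theta', theta''."""
    numerator = -r * ddr * dth + r**2 * dth**3 + 2 * dr**2 * dth + r * dr * ddth
    denominator = (dr**2 + r**2 * dth**2) ** 1.5
    return numerator / denominator

# Circle of radius 3 traversed counterclockwise with theta(t) = 2t.
kappa_circle = polar_curvature(r=3.0, dr=0.0, ddr=0.0, dth=2.0, ddth=0.0)

# Archimedean spiral r(t) = theta(t) = t, evaluated at t = 1.
kappa_spiral = polar_curvature(r=1.0, dr=1.0, ddr=0.0, dth=1.0, ddth=0.0)
```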


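In general, a frame $\mathcal{F}$ in $\mathbb{R}^3$ that varies with respect to a parameter $t$ consists of
a quadruple of vector functions $(\vec{a}(t), \vec{e}_1(t), \vec{e}_2(t), \vec{e}_3(t))$. The vector function $\vec{a}(t)$
is a curve that traces out the motion of the base (or origin) of the frame $\mathcal{F}$, and the
vector functions $\{\vec{e}_1(t), \vec{e}_2(t), \vec{e}_3(t)\}$ are linearly independent for all $t$. We
are not constrained to only consider frames in which $\{\vec{e}_1(t), \vec{e}_2(t), \vec{e}_3(t)\}$ form an
orthonormal set for all $t$, but we will make that assumption for the remainder of
this section and we will assume in addition that this basis is a positively oriented
basis, i.e., it satisfies $\vec{e}_1 \times \vec{e}_2 = \vec{e}_3$. Now for all $t$,

\[
\vec{e}_i \cdot \vec{e}_j = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \neq j, \end{cases}
\]

so by a dot product rule,

\[
\vec{e}_i\,' \cdot \vec{e}_j = \begin{cases} 0, & \text{if } i = j, \\ -\vec{e}_i \cdot \vec{e}_j\,', & \text{if } i \neq j. \end{cases} \tag{2.13}
\]

Let $\mathcal{F} = (\vec{a}, \vec{e}_1, \vec{e}_2, \vec{e}_3)$ be a moving positive orthonormal frame. Consider the
vector function $\vec{\Omega}(t)$ defined by

\[
\vec{\Omega} = (\vec{e}_2\,' \cdot \vec{e}_3)\,\vec{e}_1 + (\vec{e}_3\,' \cdot \vec{e}_1)\,\vec{e}_2 + (\vec{e}_1\,' \cdot \vec{e}_2)\,\vec{e}_3. \tag{2.14}
\]

Using (2.13), it is easy to check that $\vec{e}_i\,' = \vec{\Omega} \times \vec{e}_i$ for all $i$ and for all $t$.

Definition 2.2.1. The vector function $\vec{\Omega}(t)$ is called the angular velocity vector of
the moving frame $\mathcal{F}$.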
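To make Equation (2.14) concrete, here is a small numerical sketch. It assumes one particular moving frame chosen only for illustration: the frame obtained by rotating the standard basis about the $z$-axis through the angle $\theta(t) = t^2$. The derivatives are approximated by central differences, so the identity $\vec{e}_i\,' = \vec{\Omega} \times \vec{e}_i$ is checked only up to a small numerical error. All function names are hypothetical helpers.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def frame(t):
    """Positive orthonormal frame rotating about the z-axis by the angle t^2."""
    c, s = math.cos(t * t), math.sin(t * t)
    return [(c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0)]

def frame_derivative(t, h=1e-6):
    """Central-difference approximation of e_1', e_2', e_3'."""
    plus, minus = frame(t + h), frame(t - h)
    return [tuple((p - m) / (2 * h) for p, m in zip(vp, vm))
            for vp, vm in zip(plus, minus)]

def angular_velocity(t):
    """Equation (2.14): (e2'.e3) e1 + (e3'.e1) e2 + (e1'.e2) e3."""
    e, de = frame(t), frame_derivative(t)
    coeffs = (dot(de[1], e[2]), dot(de[2], e[0]), dot(de[0], e[1]))
    return tuple(sum(c * v[i] for c, v in zip(coeffs, e)) for i in range(3))

t0 = 0.7
omega = angular_velocity(t0)  # expected to be close to (0, 0, 2 * t0)

# Largest componentwise gap between e_i' and Omega x e_i over the three vectors.
residual = max(
    abs(d - w)
    for e_i, de_i in zip(frame(t0), frame_derivative(t0))
    for d, w in zip(de_i, cross(omega, e_i))
)
```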
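We now consider a particle following a trajectory $\vec{x} : I \to \mathbb{R}^3$ and we propose to
determine the perceived position, velocity, and acceleration vectors in the moving
frame $\mathcal{F}$ in terms of the true position, velocity, and acceleration. Label $(\vec{x})_{\mathcal{F}}$, $(\vec{x}')_{\mathcal{F}}$,
and $(\vec{x}'')_{\mathcal{F}}$ as the perceived position, velocity, and acceleration vectors. First,

\[
(\vec{x})_{\mathcal{F}} = \vec{x} - \vec{a}. \tag{2.15}
\]

However, the perceived velocity and acceleration of $\vec{x}$ are obtained by taking the
derivatives of the components of $(\vec{x})_{\mathcal{F}}$ in $\{\vec{e}_1, \vec{e}_2, \vec{e}_3\}$. More explicitly,

\[
\begin{aligned}
(\vec{x})_{\mathcal{F}} &= ((\vec{x} - \vec{a}) \cdot \vec{e}_1)\,\vec{e}_1 + ((\vec{x} - \vec{a}) \cdot \vec{e}_2)\,\vec{e}_2 + ((\vec{x} - \vec{a}) \cdot \vec{e}_3)\,\vec{e}_3, \\
(\vec{x}')_{\mathcal{F}} &= \frac{d}{dt}\big((\vec{x} - \vec{a}) \cdot \vec{e}_1\big)\,\vec{e}_1 + \frac{d}{dt}\big((\vec{x} - \vec{a}) \cdot \vec{e}_2\big)\,\vec{e}_2 + \frac{d}{dt}\big((\vec{x} - \vec{a}) \cdot \vec{e}_3\big)\,\vec{e}_3, \\
(\vec{x}'')_{\mathcal{F}} &= \frac{d^2}{dt^2}\big((\vec{x} - \vec{a}) \cdot \vec{e}_1\big)\,\vec{e}_1 + \frac{d^2}{dt^2}\big((\vec{x} - \vec{a}) \cdot \vec{e}_2\big)\,\vec{e}_2 + \frac{d^2}{dt^2}\big((\vec{x} - \vec{a}) \cdot \vec{e}_3\big)\,\vec{e}_3.
\end{aligned} \tag{2.16}
\]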


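We can now relate the perceived position, velocity, and acceleration in the moving
frame $\mathcal{F}$ to the actual position, velocity, and acceleration. By (2.15),

\[
\vec{x} = (\vec{x})_{\mathcal{F}} + \vec{a}.
\]

Then for the velocity, since $\vec{e}_i\,' = \vec{\Omega} \times \vec{e}_i$,

\[
\vec{x}' = \sum_{i=1}^{3} \frac{d}{dt}\big((\vec{x} - \vec{a}) \cdot \vec{e}_i\big)\,\vec{e}_i + \sum_{i=1}^{3} \big((\vec{x} - \vec{a}) \cdot \vec{e}_i\big)\,\vec{e}_i\,' + \vec{a}'
= (\vec{x}')_{\mathcal{F}} + \vec{\Omega} \times (\vec{x})_{\mathcal{F}} + \vec{a}'. \tag{2.17}
\]

For the acceleration,

\[
\begin{aligned}
\vec{x}'' &= \frac{d}{dt}(\vec{x}')_{\mathcal{F}} + \frac{d}{dt}\big(\vec{\Omega} \times (\vec{x})_{\mathcal{F}}\big) + \vec{a}'' \\
&= \frac{d}{dt}\Big(\sum_{i=1}^{3} \frac{d}{dt}\big((\vec{x} - \vec{a}) \cdot \vec{e}_i\big)\,\vec{e}_i\Big) + \vec{\Omega}' \times (\vec{x})_{\mathcal{F}} + \vec{\Omega} \times \frac{d}{dt}(\vec{x})_{\mathcal{F}} + \vec{a}'' \\
&= (\vec{x}'')_{\mathcal{F}} + \vec{\Omega} \times (\vec{x}')_{\mathcal{F}} + \vec{\Omega}' \times (\vec{x})_{\mathcal{F}} + \vec{\Omega} \times \big((\vec{x}')_{\mathcal{F}} + \vec{\Omega} \times (\vec{x})_{\mathcal{F}}\big) + \vec{a}'',
\end{aligned}
\]

where the second-to-last term follows from Equation (2.17). Thus,

\[
\vec{x}'' = (\vec{x}'')_{\mathcal{F}} + 2\vec{\Omega} \times (\vec{x}')_{\mathcal{F}} + \vec{\Omega}' \times (\vec{x})_{\mathcal{F}} + \vec{\Omega} \times \big(\vec{\Omega} \times (\vec{x})_{\mathcal{F}}\big) + \vec{a}''. \tag{2.18}
\]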
All of the above terms have names in physics (see [22, p. 118]):

• $(\vec{x}'')_{\mathcal{F}}$ is called the perceived acceleration or acceleration with respect to $\mathcal{F}$;
• $2\vec{\Omega} \times (\vec{x}')_{\mathcal{F}}$ is the Coriolis acceleration;
• $\vec{\Omega} \times (\vec{\Omega} \times (\vec{x})_{\mathcal{F}})$ is the centripetal acceleration;
• $\vec{\Omega}' \times (\vec{x})_{\mathcal{F}}$ is sometimes called the transverse acceleration because it is
perpendicular to the perceived position vector $(\vec{x})_{\mathcal{F}}$;
• $\vec{a}''$ is the translational acceleration of the frame.


The above discussion described the moving frame F in terms of some abso- 
lute (unmoving) frame. Though an absolute frame arises naturally in the mental 
framework of Cartesian coordinates, to assume the existence of an absolute frame 
in physical systems poses serious challenges. We may think of a point fixed to the 
Earth as the origin for an absolute frame, but taking into account that the Earth 
moves around the Sun, and the Sun moves around the galaxy and so on should 
disqualify this choice. Using Newton’s second law of motion as a reference, classical
mechanics defines an inertial frame as one in which a particle not
subject to any forces travels in a straight line.

Now suppose that we have identified one inertial frame $\mathcal{F}_1$, and we consider
another (moving) frame $\mathcal{F}_2$. From (2.18), $\mathcal{F}_2$ will also be an inertial frame if and
only if $(\vec{x}'')_{\mathcal{F}_1} = (\vec{x}'')_{\mathcal{F}_2}$ for all trajectories $\vec{x}(t)$. This implies that $\vec{\Omega} = \vec{0}$ and that
$\vec{a}'' = \vec{0}$, expressed in reference to $\mathcal{F}_1$. Hence the unit vectors in $\mathcal{F}_2$ do not move
with reference to the basis vectors of $\mathcal{F}_1$ and the origin of $\mathcal{F}_2$ moves with a constant
velocity vector in reference to the frame $\mathcal{F}_1$.

Admittedly, the problem in practice of finding one inertial frame leads to a 
vicious circle. How do we know we have found a body free of external forces? We 
can only content ourselves with finding a frame in which Newton’s laws of motion 
hold to a “satisfactory” degree. [21] 

As an example of the application of differential geometry of curves to physics, 
we consider the notion of centripetal acceleration of a curve and its relation to the 
Frenet frame. 


Example 2.2.2 (Centripetal Acceleration of Curves). One first encounters cen- 
tripetal acceleration in the context of a particle moving around on a circle with 
constant speed v, and one defines it as the acceleration due to the change in the 
velocity vector. Phrasing the scenario mathematically, consider a particle moving 
along the trajectory with equations of motion 


\[
\vec{x}(t) = (R\cos(\omega t), R\sin(\omega t)),
\]

where $R$ is the radius of the circle and $\omega$ is the (constant) angular speed. The
velocity, speed, and acceleration are, respectively,

\[
\vec{x}'(t) = (-R\omega\sin(\omega t), R\omega\cos(\omega t)), \qquad
v = s'(t) = R\omega, \qquad
\vec{x}''(t) = (-R\omega^2\cos(\omega t), -R\omega^2\sin(\omega t)).
\]

Hence, the acceleration is

\[
\vec{x}''(t) = -\omega^2\vec{x}(t) = -\omega^2 R\,\vec{e}_r = -\frac{v^2}{R}\,\vec{e}_r, \tag{2.19}
\]

where $\vec{e}_r$ is the unit vector in the radial direction (see Equation (2.4)). This is the
centripetal acceleration for circular motion, often written $\vec{a}_c$.

The angular velocity vector $\vec{\Omega}$ is the vector of magnitude $\omega$ that is perpendicular
to the plane of rotation and with direction given by the right-hand rule. Thus,
taking $\vec{k}$ as the direction perpendicular to the plane, we have in this simple setup
$\vec{\Omega} = \omega\vec{k}$. Setting the radial vector $\vec{R} = R\,\vec{e}_r$, it is not hard to show that for this
circular motion,

\[
\vec{a}_c = \vec{\Omega} \times (\vec{\Omega} \times \vec{R}),
\]

as expected from Equation (2.18).
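The claim that the double cross product reproduces the acceleration of circular motion can be checked directly. This is only an illustrative sketch with arbitrarily chosen sample values for the radius, angular speed, and time; the variable names are not from the text.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

radius, omega, t = 2.0, 3.0, 0.4  # sample values, chosen arbitrarily

# Position and acceleration of the circular motion, embedded in the plane z = 0.
position = (radius * math.cos(omega * t), radius * math.sin(omega * t), 0.0)
acceleration = (-radius * omega**2 * math.cos(omega * t),
                -radius * omega**2 * math.sin(omega * t),
                0.0)

# Angular velocity vector: magnitude omega, perpendicular to the plane of motion.
Omega = (0.0, 0.0, omega)
centripetal = cross(Omega, cross(Omega, position))

# The magnitude should equal v^2 / R with v = R * omega.
speed = radius * omega
magnitude = math.sqrt(sum(c * c for c in centripetal))
```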

Now consider a general curve in space $\vec{x} : I \to \mathbb{R}^3$, where $I$ is an interval of
$\mathbb{R}$. We recall a few differential geometric properties of space curves. The derivative
$\vec{x}'(t)$ is called the velocity and $s'(t) = \|\vec{x}'(t)\|$ is called the speed. The curve $\vec{x}(t)$
is called regular at $t$ if $\vec{x}'(t) \neq \vec{0}$. At all regular points of a curve, we define the
unit tangent as $\vec{T}(t) = \vec{x}'(t)/\|\vec{x}'(t)\|$. Because $\vec{T}(t)$ is a unit vector for all $t$, $\vec{T}'(t)$
is perpendicular to $\vec{T}(t)$.

The curvature of the curve is the unique nonnegative function $\kappa(t)$ such that

\[
\vec{T}'(t) = s'(t)\kappa(t)\vec{P}(t) \tag{2.20}
\]

for some unit vector $\vec{P}(t)$. The vector function $\vec{P}(t)$ is called the principal normal
vector. Finally, we define the binormal vector function $\vec{B}(t)$ by $\vec{B} = \vec{T} \times \vec{P}$. In so
doing, we have defined an orthonormal set $\{\vec{T}, \vec{P}, \vec{B}\}$ associated to each point of the
curve $\vec{x}(t)$. This set $\{\vec{T}, \vec{P}, \vec{B}\}$ is called the Frenet frame.

It is not hard to show that, by construction, the derivative $\vec{B}'(t)$ is perpendicular
to $\vec{B}$ and to $\vec{T}$. We define the torsion function $\tau(t)$ of a space curve as the unique
function such that

\[
\vec{B}'(t) = -s'(t)\tau(t)\vec{P}(t). \tag{2.21}
\]

Finally, from Equations (2.20) and (2.21) we deduce that

\[
\vec{P}'(t) = -s'(t)\kappa(t)\vec{T}(t) + s'(t)\tau(t)\vec{B}(t). \tag{2.22}
\]
Figure 2.4: Center of curvature and osculating circle.


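We summarize (2.20), (2.21), and (2.22) as

\[
\frac{d}{dt}\begin{pmatrix} \vec{T} & \vec{P} & \vec{B} \end{pmatrix}
= \begin{pmatrix} \vec{T} & \vec{P} & \vec{B} \end{pmatrix}
\begin{pmatrix} 0 & -s'\kappa & 0 \\ s'\kappa & 0 & -s'\tau \\ 0 & s'\tau & 0 \end{pmatrix}. \tag{2.23}
\]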


(The above paragraphs only give the definitions of the concepts we will use 
below. A full treatment of these topics can be found in [5, Chapter 3].) 

Since a space curve is not necessarily circular, one cannot use Equation (2.19)
to determine the centripetal acceleration of $\vec{x}$. Instead, we view $\vec{x}$ in relation to
an appropriate moving frame in which centripetal acceleration makes sense. The
osculating circle is the unique circle of maximum contact with the curve $\vec{x}(t)$ at any
point $t$, and hence, the appropriate frame $\mathcal{F}$ is based at the center of curvature

\[
\vec{a}(t) = \vec{x}(t) + \frac{1}{\kappa(t)}\vec{P}(t)
\]

and has the vectors of the Frenet frame $\{\vec{T}, \vec{P}, \vec{B}\}$ as its basis. Figure 2.4 depicts
a space curve along with the center of curvature $\vec{a}(t)$ and the osculating circle
associated to a point $\vec{x}(t)$ on the curve.

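By Equation (2.14), the angular velocity vector of $\mathcal{F}$ is

\[
\vec{\Omega} = (\vec{P}' \cdot \vec{B})\,\vec{T} + (\vec{B}' \cdot \vec{T})\,\vec{P} + (\vec{T}' \cdot \vec{P})\,\vec{B} = s'\tau\,\vec{T} + s'\kappa\,\vec{B}.
\]

The relative position vector for the curve $\vec{x}$ with reference to its center of curvature
is $\vec{R} = (\vec{x})_{\mathcal{F}} = -\frac{1}{\kappa}\vec{P}$. Therefore, the centripetal acceleration is

\[
\begin{aligned}
\vec{a}_c = \vec{\Omega} \times (\vec{\Omega} \times \vec{R})
&= \vec{\Omega} \times \Big((s'\tau\,\vec{T} + s'\kappa\,\vec{B}) \times \big(-\tfrac{1}{\kappa}\vec{P}\big)\Big) \\
&= (s'\tau\,\vec{T} + s'\kappa\,\vec{B}) \times \Big(s'\,\vec{T} - \frac{s'\tau}{\kappa}\,\vec{B}\Big)
= (s')^2\kappa\,\vec{P} + (s')^2\frac{\tau^2}{\kappa}\,\vec{P} \\
&= (s')^2\,\frac{\kappa^2 + \tau^2}{\kappa}\,\vec{P}.
\end{aligned} \tag{2.24}
\]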

It is interesting to note that if a curve happens to be planar, then $\tau = 0$, and
the centripetal acceleration becomes $\vec{a}_c = (s')^2\kappa\vec{P}$, which matches Equation (2.19)
exactly since $s' = v$ and $\kappa$ is the reciprocal of the radius of curvature, $1/R$. However,
Equation (2.24) shows that, for a curve in space, the “corkscrewing” effect, measured
by $\tau$, produces a greater centripetal acceleration than does simply rotating about
the same axis. (Hence, on a rollercoaster a rider will experience more centrifugal
force, the force that balances out centripetal acceleration, if the rollercoaster
corkscrews than when it simply rotates around with the same radius of curvature.)
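To see the size of the torsion term in Equation (2.24), one can evaluate it on a circular helix $\vec{x}(t) = (a\cos t, a\sin t, bt)$. The sketch below assumes the standard closed forms for this helix, namely speed $\sqrt{a^2 + b^2}$, curvature $a/(a^2 + b^2)$, and torsion $b/(a^2 + b^2)$; these are well-known facts about the helix rather than results derived in this section, and the function names are illustrative.

```python
import math

def helix_frenet_data(a, b):
    """Speed, curvature, and torsion of the helix (a cos t, a sin t, b t)."""
    c2 = a * a + b * b
    return math.sqrt(c2), a / c2, b / c2

def centripetal_magnitude(speed, kappa, tau):
    """Magnitude of Equation (2.24): (s')^2 (kappa^2 + tau^2) / kappa."""
    return speed**2 * (kappa**2 + tau**2) / kappa

speed, kappa, tau = helix_frenet_data(1.0, 1.0)

# Helix with a = b = 1 versus a planar curve with the same speed and curvature.
with_torsion = centripetal_magnitude(speed, kappa, tau)
planar = centripetal_magnitude(speed, kappa, 0.0)
```

For $a = b = 1$ the two values are 2 and 1, so in this example the corkscrewing doubles the centripetal term computed from (2.24).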


We finish this section with a classical example of how using a useful moving 
frame renders equations of motion tractable. 


Example 2.2.3 (Radial Forces). As an application of cylindrical coordinate systems,
we can study Newton’s equation of motion applied to a particle under the
influence of a radial force. By definition, a force is called radial if $\vec{F}(\vec{r}) = f(r)\vec{e}_r$,
that is, if the force only depends on the distance from an origin and is parallel to the
position vector $\vec{r}$. (The force of gravity between two point objects and the electric
force between two charged point objects are radial forces, while the magnetic force
on a charged particle is not.)
Newton’s law of motion produces the following vector differential equation:

\[
m\vec{r}\,'' = f(r)\vec{e}_r.
\]

In order to solve this differential equation explicitly, one needs the initial position
$\vec{r}_0$ and the initial velocity $\vec{v}_0$.

For convenience, choose a plane $\mathcal{P}$ that goes through the origin and is parallel
to both $\vec{r}_0$ and $\vec{v}_0$. (If $\vec{r}_0$ and $\vec{v}_0$ are not parallel, then this information defines a
unique plane in $\mathbb{R}^3$. If $\vec{r}_0$ and $\vec{v}_0$ are parallel, then any plane parallel to these vectors
suffices.) Consider $\mathcal{P}$ to be the $xy$-plane, choose any direction for the ray $[Ox)$ and
now use cylindrical coordinates in $\mathbb{R}^3$.

For radial forces, Newton’s law of motion written in the cylindrical frame as
three differential equations is

\[
\begin{array}{ll}
\vec{e}_r: & m\big(r'' - r(\theta')^2\big) = f(r), \\
\vec{e}_\theta: & m\big(2r'\theta' + r\theta''\big) = 0, \\
\vec{e}_z: & m z'' = 0.
\end{array} \tag{2.25}
\]
Obviously, since $\vec{r}_0$ and $\vec{v}_0$ lie in the plane through the origin and parallel to $\vec{e}_r$ and
$\vec{e}_\theta$, the equations show that $\vec{r}(t)$ never leaves the $xy$-plane. Thus $z(t) = 0$.

We can now solve the second differential equation in the above system to obtain
a relationship between the functions $r$ and $\theta$. First write

\[
\frac{2r'}{r} = -\frac{\theta''}{\theta'}.
\]

Integrating both sides with respect to $t$, we then obtain $2\ln|r| = -\ln|\theta'| + C$, where
$C$ is some constant of integration. Taking the exponential of both sides, one obtains
the relationship $r^2\theta' = h$ where $h$ is a constant. In terms of the initial conditions,
we have 


and therefore, for all time t, we have 


\[
h = (\vec{r}_0 \times \vec{v}_0) \cdot \vec{e}_z.
\]

Thus, we conclude that the quantity $\vec{L} = \vec{r} \times (m\vec{v}) = m(\vec{r} \times \vec{v})$, which is called the
angular momentum and in general depends on $t$, is a constant vector function for
radial forces.

Finally, to solve the system in Equation (2.25) completely, it is convenient to
substitute variables and write the first equation in terms of $u = 1/r$ and $\theta$. Since
$r = 1/u$, we have

\[
\frac{dr}{dt} = -\frac{1}{u^2}\frac{du}{dt} = -\frac{1}{u^2}\frac{d\theta}{dt}\frac{du}{d\theta} = -h\frac{du}{d\theta}.
\]

The second derivative of $r$ gives

\[
\frac{d^2r}{dt^2} = \frac{d}{dt}\Big(-h\frac{du}{d\theta}\Big) = -h\frac{d^2u}{d\theta^2}\frac{d\theta}{dt} = -h^2u^2\frac{d^2u}{d\theta^2},
\]

where the last equality follows from the fact that $r^2\theta'$ is the constant $h$. The first part
of Equation (2.25) becomes

\[
\frac{d^2u}{d\theta^2} + u = -\frac{1}{mh^2u^2}\,f\Big(\frac{1}{u}\Big). \tag{2.26}
\]

If the radial force in question is an inverse-square law (such as the force of gravity
and the electrostatic force caused by a point charge), then the radial force is of the
form

\[
f(r) = -\frac{k}{r^2} = -ku^2.
\]

In this case, Equation (2.26) becomes

\[
\frac{d^2u}{d\theta^2} + u = \frac{k}{mh^2}.
\]

Techniques from differential equations show that the general solution to this
equation is

\[
u(\theta) = \frac{k}{mh^2} + C\cos(\theta - \theta_0),
\]

where $C$ and $\theta_0$ are constants of integration that depend on the original position
and velocity of the point particle under the influence of this radial force. In polar
coordinates, this gives the equation

\[
r(\theta) = \frac{1}{\dfrac{k}{mh^2} + C\cos(\theta - \theta_0)}. \tag{2.27}
\]
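The closed-form solution can be compared against a direct numerical integration of $u'' + u = k/(mh^2)$. The sketch below uses a hand-written classical Runge-Kutta (RK4) step, which is a choice made for this illustration rather than a technique from the text, and arbitrary sample values $k = m = h = 1$, $C = 0.5$, $\theta_0 = 0$.

```python
import math

def u_closed_form(theta, k, m, h, C, theta0):
    """General solution u(theta) = k/(m h^2) + C cos(theta - theta0)."""
    return k / (m * h * h) + C * math.cos(theta - theta0)

def integrate_orbit(theta_end, k, m, h, u0, du0, steps=2000):
    """RK4 for u'' + u = k/(m h^2), written as the system u' = w, w' = g - u."""
    g = k / (m * h * h)
    d = theta_end / steps
    u, w = u0, du0
    for _ in range(steps):
        k1u, k1w = w, g - u
        k2u, k2w = w + 0.5 * d * k1w, g - (u + 0.5 * d * k1u)
        k3u, k3w = w + 0.5 * d * k2w, g - (u + 0.5 * d * k2u)
        k4u, k4w = w + d * k3w, g - (u + d * k3u)
        u += d * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        w += d * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
    return u

k, m, h, C, theta0 = 1.0, 1.0, 1.0, 0.5, 0.0  # sample constants

# Initial data at theta = 0 taken from the closed form: u(0) and u'(0) = C sin(theta0).
u0 = u_closed_form(0.0, k, m, h, C, theta0)
du0 = C * math.sin(theta0)

u_numeric = integrate_orbit(2.0, k, m, h, u0, du0)
u_exact = u_closed_form(2.0, k, m, h, C, theta0)
r_exact = 1.0 / u_exact  # the radius from Equation (2.27) at theta = 2
```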


PROBLEMS 
2.2.1. Provide the details for the proof of Equation (2.12). 
2.2.2. Prove that Equation (2.14) is the correct vector to satisfy $\vec{e}_i\,' = \vec{\Omega} \times \vec{e}_i$ for all $i$.


2.2.3. Determine the transition matrix between the cylindrical coordinate frame and the 
Frenet frame. 


2.2.4. Calculate the curvature and torsion of a space curve defined by the functions, in
cylindrical coordinates, $(r, \theta, z) = (r(t), \theta(t), z(t))$.

2.2.5. Determine the transition matrix between the spherical coordinate frame and the 
Frenet frame. 


2.2.6. Calculate the curvature and torsion of a space curve defined by the functions, in
spherical coordinates, $(r, \theta, \phi) = (r(t), \theta(t), \phi(t))$.

2.2.7. Determine the transition matrix between the parabolic coordinate frame and the 
Frenet frame (see Problem 2.1.5). 


2.2.8. Consider the solution $r(\theta)$ in Equation (2.27). Determine $h$, $C$, and $\theta_0$ in terms
of some initial conditions for position and velocity $\vec{r}(0)$ and $\vec{v}(0)$. Prove that
for different initial conditions and different values of the constants, the locus of
Equation (2.27) is a conic. State under what conditions the locus is a circle,
ellipse, parabola, and hyperbola.

2.2.9. (ODE) Find the locus of the trajectory of a particle moving under the effect of
a radial force that is an inverse cube, i.e., $f(r) = -k/r^3$. [Hint: There are three
separate cases depending on whether $mh^2 > k$, $mh^2 = k$, or $mh^2 < k$.]


2.3. Variable Frames and Matrix Functions 


In the preceding sections, we often described the rate of change of variable frames 
using a matrix function. Equations (2.1), (2.6), (2.8), and (2.23) established a 
matrix formula to describe the rate of change of the frame vectors for Cartesian, 
cylindrical, spherical, and Frenet frames respectively. This section generalizes this 
perspective for variable frames. 
By a matrix function, we mean a function $F : U \to M_{m \times n}(\mathbb{R})$, where $U$ is
an open subset of $\mathbb{R}^p$. Identifying the set of $m \times n$ matrices $M_{m \times n}(\mathbb{R})$ with
the Euclidean space $\mathbb{R}^{mn}$, the analysis on multivariable functions developed in
Chapter 1 applies. A single variable matrix function can be viewed as a curve
$\gamma : I \to M_{m \times n}(\mathbb{R})$, where $I$ is an interval of $\mathbb{R}$. As with any parametrized curve,
the derivative $\gamma'(t)$ is the $m \times n$ matrix of derivatives of component functions of
$\gamma(t)$.

Proposition 2.3.1. Let $\gamma_1(t)$ and $\gamma_2(t)$ be matrix functions defined over an interval
$I$, and let $A$ be any constant matrix. Assuming the operations are defined, the
following identities hold:

1. $\frac{d}{dt}(A)$ is the 0-matrix of the same dimensions as $A$.


Q 
bh 
as) 
BR 
ot 
~~" 
II 
hb 
ned 
fa 
g 
3 
a 
Sa 
os 
2 
BR 
i, 
~~" 
a 
~~ 
II 
~ 
m 
~~" 
= 


ala Sila Bla Sl 
~~ 
2 
fa 
— 
cH 


2 
3 
4. 
5 

6. If $\gamma_1(t)$ is invertible for all $t$, then $\frac{d}{dt}\big(\gamma_1(t)^{-1}\big) = -\gamma_1(t)^{-1}\,\gamma_1'(t)\,\gamma_1(t)^{-1}$.
Proof. (Left as exercises for the reader.)


A particularly useful matrix function involves the exponential of matrices. Let
$A$ be a $p \times p$ matrix and let $\vec{v} \in \mathbb{R}^p$. Consider the sequence $\{\vec{x}_n\}_{n=0}^{\infty}$ of vectors
defined by

\[
\vec{x}_n = \sum_{k=0}^{n} \frac{1}{k!}A^k\vec{v},
\]

where by $A^0$, we mean the identity matrix $I$. We prove that this sequence is a
Cauchy sequence. Denoting by $|A|$ the matrix norm of $A$, we have, for $n > m$,

\[
\|\vec{x}_n - \vec{x}_m\| = \Big\|\sum_{k=m+1}^{n} \frac{1}{k!}A^k\vec{v}\Big\|
\leq \sum_{k=m+1}^{n} \frac{1}{k!}\|A^k\vec{v}\|
\leq \sum_{k=m+1}^{n} \frac{1}{k!}|A|^k\|\vec{v}\|
\leq \sum_{k=m+1}^{\infty} \frac{1}{k!}|A|^k\|\vec{v}\|,
\]

where the last inequality follows since all the terms are nonnegative. Then since
$(k - m - 1)!/k! \leq 1/(m + 1)!$, we have

\[
\|\vec{x}_n - \vec{x}_m\|
\leq |A|^{m+1}\|\vec{v}\| \sum_{k=m+1}^{\infty} \frac{(k - m - 1)!}{k!}\,\frac{1}{(k - m - 1)!}\,|A|^{k-m-1}
\leq \frac{|A|^{m+1}}{(m + 1)!}\|\vec{v}\| \sum_{j=0}^{\infty} \frac{1}{j!}|A|^j
= \frac{|A|^{m+1}}{(m + 1)!}\|\vec{v}\|\,e^{|A|},
\]

where we substituted $j = k - m - 1$. For any positive real number $|A|$, the limit
of $|A|^{m+1}/(m + 1)!$ as $m \to \infty$ is 0. Hence, for any positive $\varepsilon$, there exists $N$ large
enough so that $m, n > N$ implies that

\[
\|\vec{x}_n - \vec{x}_m\| \leq \frac{|A|^{m+1}}{(m + 1)!}\|\vec{v}\|\,e^{|A|} < \varepsilon.
\]

This establishes that the sequence $\{\vec{x}_n\}_{n=0}^{\infty}$ is a Cauchy sequence. Since $\mathbb{R}^p$ is a
complete metric space, we conclude that this sequence converges. Since the sequence
of vectors converges for all $\vec{v}$, we conclude that the series of matrices in the following
definition converges.


Definition 2.3.2. Let $A$ be an $n \times n$ matrix. We define the exponential of $A$ as

\[
e^A = \sum_{k=0}^{\infty} \frac{1}{k!}A^k.
\]

Proposition 2.3.3. Let $A$ and $B$ be two matrices that commute, i.e., satisfying
$AB = BA$. Then

\[
e^{A+B} = e^A e^B.
\]

Proof. (Left as an exercise for the reader.)


This proposition allows us to conclude the following interesting result. 


Proposition 2.3.4. For all $A \in M_{n \times n}(\mathbb{R})$, the exponential matrix $e^A$ is invertible.
Proof. The matrices $A$ and $-A$ commute. Hence, by Proposition 2.3.3, $e^A e^{-A} =
e^{A-A} = e^0 = I$. Thus $e^A$ has an inverse.


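Now let $A \in M_{n \times n}(\mathbb{R})$ and consider the matrix function $\gamma(t) = e^{At}$. Note first
that $\gamma(0) = I$. The derivative of $\gamma(t)$ is

\[
\gamma'(t) = \frac{d}{dt}\Big(\sum_{k=0}^{\infty} \frac{1}{k!}A^k t^k\Big)
= \sum_{k=0}^{\infty} \frac{1}{k!}\frac{d}{dt}\big(A^k t^k\big)
= \sum_{k=1}^{\infty} \frac{1}{k!}A^k k\,t^{k-1}
= A\sum_{k=1}^{\infty} \frac{1}{(k - 1)!}A^{k-1}t^{k-1}
= Ae^{At}.
\]

In particular $\gamma'(0) = A$.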
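The series definition and the derivative formula $\gamma'(t) = Ae^{At}$ can be tried out numerically. The following sketch truncates the series of Definition 2.3.2 after a fixed number of terms and uses a central difference for the derivative, so both are approximations. The sample matrix is chosen because it satisfies $A^2 = I$, which gives the closed form $e^{At} = \cosh(t)\,I + \sinh(t)\,A$ to compare against. The helper names `mat_mul`, `mat_exp`, and `scale` are illustrative.

```python
import math

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def scale(A, t):
    return [[t * x for x in row] for row in A]

def mat_exp(A, terms=30):
    """Partial sum of the exponential series: sum of A^k / k! for k < terms."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]          # holds A^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        total = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total

A = [[1.0, 2.0], [0.0, -1.0]]   # sample matrix with A^2 = I
t, h = 0.5, 1e-5

gamma_t = mat_exp(scale(A, t))
closed_form = [[math.exp(t), math.exp(t) - math.exp(-t)], [0.0, math.exp(-t)]]
series_error = max(abs(a - b) for ra, rb in zip(gamma_t, closed_form)
                   for a, b in zip(ra, rb))

# Central-difference derivative of gamma(t) = e^{At}, compared with A e^{At}.
plus, minus = mat_exp(scale(A, t + h)), mat_exp(scale(A, t - h))
numeric = [[(p - m) / (2 * h) for p, m in zip(rp, rm)] for rp, rm in zip(plus, minus)]
exact = mat_mul(A, gamma_t)
derivative_error = max(abs(a - b) for ra, rb in zip(numeric, exact)
                       for a, b in zip(ra, rb))
```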

We now connect these concepts to moving frames. 

Any frame $\mathcal{F}$ of $\mathbb{R}^n$ consists of an origin and a basis $(\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n)$ of $\mathbb{R}^n$. Since
the basis consists of $n$ linearly independent vectors, the matrix

\[
M = \begin{pmatrix} \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_n \end{pmatrix},
\]

where the $i$th column is the vector $\vec{u}_i$, is an invertible matrix. Similarly, for any
invertible $n \times n$ matrix $M$, the columns form a basis of $\mathbb{R}^n$. If $\mathcal{F}$ is a moving frame,
we can view the vectors of this moving frame as a matrix function $M(t)$ where $M(t)$
is invertible for all $t$. More precisely, $M$ is a matrix function $M : I \to GL_n(\mathbb{R})$,
where $I$ is an interval of $\mathbb{R}$ and $GL_n(\mathbb{R}) = \{A \in M_{n \times n}(\mathbb{R}) \mid \det A \neq 0\}$ is the set of
invertible matrices, also called the general linear group.

For any moving frame $\mathcal{F}$ with basis vectors $(\vec{u}_1(t), \ldots, \vec{u}_n(t))$, we can express
each derivative $\vec{u}_i\,'(t)$ as a linear combination of these same basis vectors at a given
$t$. As we did with Equations (2.1), (2.6), (2.8), and (2.23), if we consider the matrix
function $M(t) = \begin{pmatrix} \vec{u}_1(t) & \cdots & \vec{u}_n(t) \end{pmatrix}$, then this decomposition can be expressed as

\[
M'(t) = M(t)A(t)
\]

for some matrix $A(t)$.


Proposition 2.3.5. Let $B \in M_{n \times n}(\mathbb{R})$ be arbitrary. There exists a variable frame
with matrix function $M : I \to GL_n(\mathbb{R})$ with rate of change matrix $A(t)$ satisfying
$M'(t) = M(t)A(t)$ such that $A(t_0) = B$ for some $t_0 \in I$.

Proof. Consider $M(t) = e^{B(t - t_0)} = e^{Bt}e^{-Bt_0}$. We note that $M(t_0) = e^0 = I_n$.
By the differentiation property of the matrix exponential, $M'(t) = Be^{Bt}e^{-Bt_0} =
Be^{B(t - t_0)}$. Thus $A(t_0) = M(t_0)A(t_0) = M'(t_0) = Be^0 = B$.


Many of the examples that we discussed in the previous section involved orthonormal
variable frames. An orthonormal basis in $\mathbb{R}^n$ is any $n$-tuple of vectors
$(\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n)$ such that

\[
\vec{u}_i \cdot \vec{u}_j = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \neq j. \end{cases} \tag{2.28}
\]

Then using as usual $M = \begin{pmatrix} \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_n \end{pmatrix}$ we note that the $ij$th entry of $M^\top M$
is precisely the dot product $\vec{u}_i \cdot \vec{u}_j$. Consequently, the vectors form an orthonormal
frame if and only if $M^\top M = I_n$, i.e., $M$ is an orthogonal matrix.

Since $\det(M^\top) = \det(M)$, an orthogonal matrix $M$ satisfies $\det(M) = \pm 1$. An
orthonormal basis $(\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n)$ is called positively oriented if $\det(M) = 1$ and
negatively oriented if $\det(M) = -1$. The set of orthogonal $n \times n$ matrices is denoted
by $O(n)$, and the set of positive orthogonal matrices is denoted by

\[
SO(n) = \{M \in O(n) \mid \det(M) = 1\}.
\]

Both $O(n)$ and $SO(n)$ have a group structure, a property not discussed in this book,
and are respectively called the orthogonal group and special orthogonal group.
(Note that the order of the basis vectors in the $n$-tuple $(\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n)$ matters
since a permutation of these vectors may change the sign of the determinant of the
corresponding matrix $M$. Consequently, we must talk about an $n$-tuple of vectors
as opposed to just a set of vectors. One should also be aware that a permutation
of vectors in the basis $\mathcal{B} = (\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n)$ would lead to another basis $\mathcal{B}'$ which
consists of the same set of vectors but has a coordinate transition matrix which is
a permutation matrix.)

We can therefore view an orthonormal moving frame in $\mathbb{R}^n$ as a map $M : I \to
O(n)$, where $I \subset \mathbb{R}$ is an interval. In this case, if $M$ is continuous, then $\det(M(t))$
is a continuous function from $I$ to $\{-1, 1\}$. Consequently, since $I$ is connected, $\det(M(t))$ is either
1 for all $t$ or $-1$ for all $t$. When $\det(M(t)) = 1$, we say that $M$ corresponds to a positive
orthonormal moving frame, and we view $M$ as a function $M : I \to SO(n)$. The sets
of matrices $O(n)$ and $SO(n)$ have the subset topology induced from the Euclidean
topology on $\mathbb{R}^{n^2}$. Consequently, the notions of continuity and differentiability of a
moving frame are familiar notions.


Proposition 2.3.6. Let $I \subset \mathbb{R}$ be an interval, and let $M : I \to O(n)$ be a differentiable
function. Then the matrix function $A(t)$ defined by

\[
M'(t) = M(t)A(t)
\]

is antisymmetric for all $t \in I$. Furthermore, for any antisymmetric matrix $B$, there
exists a matrix function $M : I \to O(n)$ such that $A(t_0) = B$ for some $t_0 \in I$.

Proof. Since $M(t) \in O(n)$, we have $M(t)^\top M(t) = I_n$ for all $t$, and similarly
$M(t)M(t)^\top = I_n$. Hence, using the differentiation rules,

\[
0 = M'(t)M(t)^\top + M(t)\frac{d}{dt}\big(M(t)^\top\big) = M'(t)M(t)^\top + M(t)(M'(t))^\top.
\]

Thus $M(t)(M'(t))^\top = -M'(t)M(t)^\top$ so after multiplying on the right by $(M(t)^{-1})^\top$
and on the left by $M(t)^{-1}$, we get

\[
\begin{aligned}
(M'(t))^\top (M(t)^{-1})^\top &= -M(t)^{-1}M'(t) \\
\Longrightarrow \quad (M(t)^{-1}M'(t))^\top &= -M(t)^{-1}M'(t).
\end{aligned}
\]

However, from the definition of the matrix function $A(t)$, we have $A(t) = M(t)^{-1}M'(t)$.
Hence, we deduce that $A(t)^\top = -A(t)$ and therefore that $A(t)$ is antisymmetric.

For the second part of the proof, let $B$ be antisymmetric, i.e., $B^\top = -B$, and
consider the matrix function $M(t) = e^{B(t - t_0)}$. Then

\[
M(t)^\top M(t) = e^{B^\top(t - t_0)}e^{B(t - t_0)} = e^{-B(t - t_0)}e^{B(t - t_0)} = I_n.
\]

Hence, $M(t)$ is orthogonal for all $t \in I$. Then $M'(t) = Be^{B(t - t_0)} = BM(t)$ so

\[
M'(t_0) = BM(t_0) = BI = B,
\]

and the result follows.


The four cases that motivated this section, namely Equations (2.1), (2.6), (2.8), 
and (2.23), illustrate the first part of Proposition 2.3.6. In all of those examples, 
the A(t) matrix function was an antisymmetric matrix for all t. 
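The second part of Proposition 2.3.6 lends itself to a numerical illustration: exponentiating an antisymmetric matrix should produce an orthogonal matrix of determinant 1. The sketch below is only a check on one arbitrarily chosen $3 \times 3$ antisymmetric matrix, using a truncated exponential series; the helper names are illustrative and the sample values of $t$ and $t_0$ carry no special meaning.

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(column) for column in zip(*X)]

def mat_exp(A, terms=40):
    """Partial sum of the exponential series: sum of A^k / k! for k < terms."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]          # holds A^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        total = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Sample antisymmetric matrix: B^T = -B.
B = [[0.0, -1.0, 2.0],
     [1.0, 0.0, -3.0],
     [-2.0, 3.0, 0.0]]
t, t0 = 0.8, 0.5

M = mat_exp([[b * (t - t0) for b in row] for row in B])   # M(t) = e^{B(t - t0)}
gram = mat_mul(transpose(M), M)                           # should be close to I
orthogonality_error = max(abs(gram[i][j] - float(i == j))
                          for i in range(3) for j in range(3))
determinant = det3(M)                                     # should be close to 1
```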


PROBLEMS 

2.3.1. Prove Proposition 2.3.3. 

2.3.2. Let $J$ be a constant $n \times n$ matrix and let $\gamma : (-\varepsilon, \varepsilon) \to GL_n(\mathbb{R})$ be a differentiable
matrix function, where $\varepsilon > 0$. Suppose that $\gamma(0) = I_n$ and suppose that
$\gamma(t)^\top J\gamma(t) = J$ for all $t \in (-\varepsilon, \varepsilon)$. Prove that the matrix $\gamma'(0)$ satisfies

\[
\gamma'(0)^\top J = -J\gamma'(0).
\]

2.3.3. Find an example of an $n \times n$ matrix function $\gamma(t)$ such that $f'(t)$ is never 0, where
$f(t) = \det(\gamma(t))$, but such that $\det(\gamma'(t)) = 0$ for all $t$.

2.3.4. Let $\gamma : I \to \mathbb{R}^{m \times n}$ be a differentiable matrix function, and let $f : J \to I$ be a
differentiable function. Prove the chain rule for matrix functions, namely,

\[
\frac{d}{dt}\gamma(f(t)) = \gamma'(f(t))\,f'(t).
\]

2.3.5. Let $A(t)$ and $B(t)$ be two $n \times n$ matrix functions defined over an interval $J \subset \mathbb{R}$.
(a) Suppose that $A(t)$ and $B(t)$ are similar for all $t \in J$. Prove that $A'(t)$ and
$B'(t)$ are not necessarily similar.
(b) Suppose that $A(t)$ and $B(t)$ are similar for all $t \in J$ and that $A(t_0) = \lambda I$.
Prove that $A'(t_0)$ and $B'(t_0)$ are similar.
(c) Suppose that $A(t)$ and $B(t)$ are similar in that $B(t) = SA(t)S^{-1}$ for some
fixed invertible matrix $S$. Prove that $A'(t)$ and $B'(t)$ are similar.

2.3.6. Prove Proposition 2.3.1. 

2.3.7. Suppose that 7 : J > GLn(R) be a matrix function and let A(t) = e7™™. Is it true 
that A’(t) = e7(t). 

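2.3.8. Let $A$ be a diagonalizable matrix with $A = PDP^{-1}$, where $D$ is diagonal with

\[
D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix},
\]

and $\lambda_i \in \mathbb{R}$ for all $i$. Prove that

\[
e^{At} = P\begin{pmatrix} e^{\lambda_1 t} & 0 & \cdots & 0 \\ 0 & e^{\lambda_2 t} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{\lambda_n t} \end{pmatrix}P^{-1}.
\]

2.3.9. Show that if

\[
A = \begin{pmatrix} 0 & -\omega \\ \omega & 0 \end{pmatrix},
\]

then

\[
e^{At} = \begin{pmatrix} \cos\omega t & -\sin\omega t \\ \sin\omega t & \cos\omega t \end{pmatrix}.
\]

2.3.10. Let $\vec{\omega} \in \mathbb{R}^3$ be a nonzero vector.

(a) Show that for any $\vec{x} \in \mathbb{R}^3$, we can write the cross product as the matrix
product

\[
\vec{\omega} \times \vec{x} = \begin{pmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{pmatrix}\vec{x}.
\]

(b) Call $W$ the $3 \times 3$ matrix in the above expression. Prove that $e^{Wt}$ is the
matrix of rotation about the axis with direction $\vec{\omega}$ and with angle $\|\vec{\omega}\|t$.

2.3.11. We define $SL_n(\mathbb{R}) = \{A \in M_{n \times n}(\mathbb{R}) \mid \det(A) = 1\}$ and call this set the special
linear group. Suppose that $M : (-\varepsilon, \varepsilon) \to SL_n(\mathbb{R})$ with $M(0) = I_n$, the identity
matrix, and suppose that $M'(t) = M(t)A(t)$ for all $t \in (-\varepsilon, \varepsilon)$. Prove that the
trace of $A(0)$ is $\operatorname{Tr} A(0) = 0$. [Recall that the trace of a matrix is the sum of its
diagonal elements. Hint: Use the definition of the determinant that if $M = (m_{ij})$,
then

\[
\det(M) = \sum_{\sigma \in S_n} (\operatorname{sign}\sigma)\, m_{1\sigma(1)} m_{2\sigma(2)} \cdots m_{n\sigma(n)},
\]

where $S_n$ is the set of permutations on $n$ elements.]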


2.3.12. This exercise gives an interesting property about the derivative of determinants
of square matrices of functions. Let $A = (a_{ij}(t))$ be an $n \times n$ matrix of functions.

(a) Use the formula for the determinant given in the previous exercise to show
that

\[
\frac{d}{dt}(\det A) = \sum_{i=1}^{n}\sum_{j=1}^{n} (-1)^{i+j}\det(A_{ij})\,\frac{da_{ij}}{dt},
\]

where $A_{ij}$ is the $ij$th minor of $A$.

(b) Conclude that if $A$ is a symmetric matrix, then

\[
\frac{d}{dt}(\det A) = (\det A)\sum_{i,j=1}^{n} a^{ij}\,\frac{da_{ij}}{dt},
\]

where the $a^{ij}$ are the entries of the inverse matrix $A^{-1}$.
CHAPTER 3
Differentiable Manifolds 


In previous geometry or calculus courses, we studied curves and surfaces as subsets
of some ambient Euclidean space $\mathbb{R}^n$. We defined parametrizations as vector
functions of one (for a curve) or two (for a surface) variables into $\mathbb{R}^2$ or $\mathbb{R}^3$, without
pointing out that many of our constructions relied on the fact that $\mathbb{R}^2$ and $\mathbb{R}^3$ are
topological vector spaces. That we have only studied geometric objects that are
subsets of $\mathbb{R}^3$ does not belie our intuition since the daily reality of human experience
evolves (or at least appears to evolve) completely in three dimensions that we
feel are flat. However, both in mathematics and in physics, one does not need to
take such a large step in abstraction to realize the insufficiency of this intuition.

In geometry, one can easily define natural point sets that cannot be properly
represented in only three dimensions. For example, the real projective plane $\mathbb{RP}^2$
can be defined as the set of lines through the origin in $\mathbb{R}^3$ or
also as the set of equivalence classes of points in $\mathbb{R}^3 - \{(0, 0, 0)\}$ under the equivalence
relation

\[
(x_0, x_1, x_2) \sim (y_0, y_1, y_2) \quad\text{if and only if}\quad
(y_0, y_1, y_2) = (\lambda x_0, \lambda x_1, \lambda x_2) \text{ for some } \lambda \in \mathbb{R} - \{0\}.
\]


The projective plane plays a fundamental role in geometry, and also in topology 
and algebraic geometry. From the above construction, it appears that the projec- 
tive plane (as its name suggests) should be a two-dimensional object since, from 
a topological viewpoint, it is the identification space (see Definition A.2.44) of a 
three-dimensional object by a one-dimensional object. Both in classical geometry 
and in algebraic geometry, there exist natural methods to study curves on the pro- 
jective plane, thereby providing a language to “do analysis” on the projective plane. 
Nonetheless, it is not hard to show that no subset of $\mathbb{R}^3$ is homeomorphic to $\mathbb{RP}^2$.
There does exist a subset of $\mathbb{R}^4$ that is in fact homeomorphic to $\mathbb{RP}^2$ but this fact
is not obvious from the definition of the projective plane. Consequently, to pro- 
vide definitions that include projective spaces and other more abstract geometric 
objects, we must avoid referring to some ambient Euclidean space. 
In physics, the need for eliminating a Euclidean ambient space boasts a more
colorful history. Inspired by evidence provided by scientists like Toscanelli, explorers
of the 15th and 16th centuries debunked the flat-earth theory by circumnavigating 
the globe. Though the normal Euclidean geometry remained valid on the small 
scale, namely, doing geometry on a flat surface (sheet of paper or plot of land), such 
methods no longer sufficed when considering the geometry of the earth as a whole. 
In particular, the science of cartography suddenly became far more mathemati- 
cal in nature as navigators attempted to represent, with some degree of accuracy, 
coastlines of continents on a flat sheet of paper. 

No less revolutionary was Einstein’s theory of general relativity in which both 
space and time are connected as a single, four-dimensional space-time entity that 
could itself be curved. In fact, following from the postulate that nothing with mass 
travels faster than the speed of light, Einstein’s theory purports that mass must 
distort space-time. 

The practical need to do geometry or do physics in continuous point-set spaces 
that are not Euclidean leads us to generalize our concepts of curves and surfaces 
to higher-dimensional objects. We will call these objects of study differentiable 
manifolds. We will then define maps between manifolds and establish an analysis 
of maps between differentiable manifolds. Our definitions, which may seem a little 
weighty, attempt to retain sufficient restrictions to ensure that doing calculus on 
the sets is possible, while preserving enough freedom to incorporate the rich variety 
of geometric objects to which we wish to apply our techniques. 


3.1 Definitions and Examples 


As a motivating example for differentiable manifolds, we recall the definition of a 
regular surface in $\mathbb{R}^3$ (see [5, Chapter 5] for more background).


Definition 3.1.1. A subset $S \subset \mathbb{R}^3$ is a regular surface if for each $p \in S$, there
exists an open set $U \subset \mathbb{R}^2$, an open neighborhood $V$ of $p$ in $\mathbb{R}^3$, and a surjective
continuous function $X : U \to V \cap S$ such that

1. $X$ is differentiable: if we write $X(u,v) = (x(u,v), y(u,v), z(u,v))$, then the
functions $x(u,v)$, $y(u,v)$, and $z(u,v)$ have continuous partial derivatives of all
orders;

2. $X$ is a homeomorphism: $X$ is continuous and has an inverse $X^{-1} : V \cap S \to U$
such that $X^{-1}$ is continuous;

3. $X$ satisfies the regularity condition: for each $(u,v) \in U$, the differential
$dX_{(u,v)} : \mathbb{R}^2 \to \mathbb{R}^3$ is a one-to-one linear transformation.


This definition already introduces many of the subtleties that are inherent in 
the concept of a manifold. In the above definition, each function $X : U \to V \cap S$ is
called a parametrization of a coordinate neighborhood.

Now, as we set out to define differentiable manifolds and remove any reference 
to an ambient Euclidean space, we begin from the context of topological spaces. 


(Appendix A gives a brief introduction to topological spaces.) Not every topological
space can fit the bill of usefulness for differential geometry, so we require
some additional properties of what types of topological spaces we will consider. 
We first impose the requirement of having a cover of open sets, each of which is 
homeomorphic to an open set in a Euclidean space. 


Definition 3.1.2. A topological manifold of dimension $n$ is a Hausdorff topological
space $M$ with a countable base such that for all $x \in M$, there exists an open
neighborhood of $x$ that is homeomorphic to an open set of $\mathbb{R}^n$.


The reader is encouraged to refer to Section A.2 in the appendices for defini- 
tions and discussions about the base of a topology and the Hausdorff property. A 
topological space that has a countable base is called second countable. The techni- 
cal aspect of this definition attempts to define a category of objects as general as 
possible, while still remaining relevant for geometry that generalizes that on $\mathbb{R}^n$.

In the definition of a topological manifold, a given homeomorphism of a neighborhood
of $M$ with a subset of $\mathbb{R}^n$ provides a local coordinate system or coordinate
patch. As one moves around on the manifold, one passes from one coordinate patch
to another. In the overlap of coordinate patches, there exist change-of-coordinate
functions that, by definition, are homeomorphisms between open sets in $\mathbb{R}^n$ (see
Figure 3.1). However, in order to define a theory of calculus on the manifold, these 
functions must be differentiable. We make this clear in the following definition. 


Definition 3.1.3. A differentiable manifold $M$ of dimension $n$ is a topological
manifold along with a collection of functions $\mathcal{A} = \{\phi_\alpha : U_\alpha \to \mathbb{R}^n\}_{\alpha \in I}$, with $U_\alpha$
open in $M$, called charts, satisfying

1. For each chart, $\phi_\alpha(U_\alpha) = V_\alpha$ is open in $\mathbb{R}^n$ and $\phi_\alpha : U_\alpha \to V_\alpha$ is a homeomorphism;

2. The collection of sets $U_\alpha$, called coordinate patches, cover $M$, i.e.,

$$M = \bigcup_{\alpha \in I} U_\alpha;$$

3. For any pair of charts $\phi_\alpha$ and $\phi_\beta$, the change-of-coordinates

$$\phi_{\alpha\beta} \overset{\text{def}}{=} \phi_\alpha \circ \phi_\beta^{-1}\big|_{\phi_\beta(U_\alpha \cap U_\beta)} : \phi_\beta(U_\alpha \cap U_\beta) \longrightarrow \phi_\alpha(U_\alpha \cap U_\beta),$$

called the transition function, is a function of class $C^1$ between open subsets
of $\mathbb{R}^n$.

The collection of functions $\mathcal{A} = \{\phi_\alpha\}_{\alpha \in I}$ satisfying the above conditions is called
an atlas.

A differentiable manifold is called a $C^k$ manifold, a smooth manifold, or an
analytic manifold if all the transition functions in the atlas are respectively $C^k$,
$C^\infty$, or analytic.
Figure 3.1: Change-of-coordinate maps.


A few comments about notation are in order here. Mimicking the notation
habits for common sets (Euclidean space $\mathbb{R}^n$, or the $n$-sphere as $S^n$), if $M$ is an
$n$-dimensional differentiable manifold, we sometimes shorten the language by referring
to the “differentiable manifold $M^n$.” Also, though technically a chart is a function
$\phi : U \to \mathbb{R}^n$, where $U$ is an open subset of the manifold $M$, one sometimes refers to
the chart $(U, \phi)$ to emphasize the letter to be used for the domain of $\phi$. Though we
use a single letter to designate a differentiable manifold, the atlas $\mathcal{A}$ is an essential
part of the definition; consequently, we sometimes refer to the differentiable manifold
as the pair $(M^n, \mathcal{A})$ to indicate the letter we are using to designate the atlas.
Finally, since the domains $U_\alpha$ of the charts cover $M$, they satisfy the condition of
a topological manifold that each $x \in M$ must have an open neighborhood that is
homeomorphic to an open set in $\mathbb{R}^n$.

At first pass, the definition of a differentiable manifold may seem unnecessarily 
complicated. However, this definition removes any reference to an ambient space, 
a feature whose virtues we discussed in the introduction to this chapter. After all, 
from a geometric perspective, this is the safe thing to do: a priori we do not know 
whether a given manifold can be described as a subset of an ambient Euclidean 
space. The application to general relativity also gives a compelling reason: in 
general relativity, the universe is a spacetime whole that is not Euclidean, sometimes 
called “curved.” However, it would be misleading to think of this curved spacetime 
as a subset of a larger Euclidean space. Removing any reference to an ambient space 
is the proper approach to presenting a mathematical structure that appropriately 
models a non-Euclidean space in which we wish to do calculus. The above definition
and subsequent constructions have proven general enough and structured enough
to be useful in geometry and in physics.

Figure 3.2: Stereographic projection.


Many properties of manifolds that arise in analysis are local properties, in that 
we only need to know information about the manifold M in some neighborhood of 
a point $p \in M$. When this is the case, we can restrict our attention to a single coordinate
chart $\phi_\alpha : U_\alpha \to \mathbb{R}^n$, where $p \in U_\alpha$. Saying that the coordinates of a point $p$
(with respect to this chart) are $(x^1, x^2, \ldots, x^n)$ means that $\phi_\alpha(p) = (x^1, x^2, \ldots, x^n)$.
For reasons that will only become clear later, it is convenient to follow the tensor 
notation convention of using superscripts for coordinates. This makes writing poly- 
nomial functions in the coordinates more tedious but this notation will provide a 
convenient way to distinguish between covariant and contravariant properties. 


Example 3.1.4 (Sphere). Consider the unit sphere $S^2 = \{(x,y,z) \in \mathbb{R}^3 \mid x^2+y^2+z^2 = 1\}$
and call $N = (0,0,1)$ the North pole and $S = (0,0,-1)$ the South
pole. We define the stereographic projection from the North pole $N$ as the function
$\pi_N : S^2 - \{N\} \to \mathbb{R}^2$, where $\pi_N(p)$ is the intersection of the line $(Np)$ with the
$xy$-plane (see Figure 3.2). The definition for $\pi_S$, the stereographic projection from
the South pole, is similar.


In Exercise 3.1.1, we prove the following results. The formula for stereographic
projection:

1. from the north pole is $\pi_N(x,y,z) = \left( \dfrac{x}{1-z}, \dfrac{y}{1-z} \right)$, and

2. from the south pole is $\pi_S(x,y,z) = \left( \dfrac{x}{1+z}, \dfrac{y}{1+z} \right)$.
The inverses of stereographic projection are not hard to find either. In particular,

$$\pi_N^{-1}(u,v) = \left( \frac{2u}{u^2+v^2+1}, \frac{2v}{u^2+v^2+1}, \frac{u^2+v^2-1}{u^2+v^2+1} \right),$$

$$\pi_S^{-1}(u,v) = \left( \frac{2u}{u^2+v^2+1}, \frac{2v}{u^2+v^2+1}, \frac{1-u^2-v^2}{u^2+v^2+1} \right).$$


The domain of $\pi_N$ is $U_N = S^2 - \{N\}$ and the domain of $\pi_S$ is $U_S = S^2 - \{S\}$, so
these domains do cover the sphere $S^2$. As fractions of polynomials whose denominators do not vanish on their domains, $\pi_N$ and $\pi_N^{-1}$ are
both continuous, so $\pi_N$ is a homeomorphism. The same holds for $\pi_S$.

For the transition function $\pi_S \circ \pi_N^{-1}$, we note first that the domain is $\pi_N(U_N \cap U_S)$.
Since $\pi_N(S) = (0,0)$, we have $\pi_N(U_N \cap U_S) = \mathbb{R}^2 - \{(0,0)\}$. Furthermore, it is not
hard to show that

$$\pi_S \circ \pi_N^{-1}(u,v) = \left( \frac{u}{u^2+v^2}, \frac{v}{u^2+v^2} \right). \tag{3.1}$$

By repeated application of the quotient rule, any repeated partial derivative of
either component function of $\pi_S \circ \pi_N^{-1}$ is a polynomial in $u$ and $v$ divided by a
power of $u^2 + v^2$. Since the domain of $\pi_S \circ \pi_N^{-1}$ is $\mathbb{R}^2 - \{(0,0)\}$, all of these partial
derivatives exist and are continuous. Thus, $\pi_S \circ \pi_N^{-1}$ is $C^\infty$. The same thing occurs
for $\pi_N \circ \pi_S^{-1}$. Hence, the set $\{\pi_N, \pi_S\}$ provides an atlas that equips $S^2$ with the
structure of a smooth manifold.
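The formulas of Example 3.1.4 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `pi_N`, `pi_S`, `pi_N_inv`, `transition`, `closed_form`, and `close` are illustrative choices, and the sample points are arbitrary. It compares the composite $\pi_S \circ \pi_N^{-1}$ with the closed form $(u/(u^2+v^2), v/(u^2+v^2))$.

```python
import math

def pi_N(p):
    """Stereographic projection from the north pole N = (0, 0, 1)."""
    x, y, z = p
    return (x / (1.0 - z), y / (1.0 - z))

def pi_S(p):
    """Stereographic projection from the south pole S = (0, 0, -1)."""
    x, y, z = p
    return (x / (1.0 + z), y / (1.0 + z))

def pi_N_inv(q):
    """Inverse of pi_N: sends (u, v) in the plane back to the sphere."""
    u, v = q
    d = u * u + v * v + 1.0
    return (2.0 * u / d, 2.0 * v / d, (u * u + v * v - 1.0) / d)

def transition(q):
    """The transition function pi_S o pi_N^{-1}, defined away from (0, 0)."""
    return pi_S(pi_N_inv(q))

def closed_form(q):
    """The closed form (u / (u^2 + v^2), v / (u^2 + v^2))."""
    u, v = q
    r2 = u * u + v * v
    return (u / r2, v / r2)

def close(a, b, tol=1e-9):
    """Componentwise comparison of two tuples up to a tolerance."""
    return all(math.isclose(s, t, rel_tol=tol, abs_tol=tol) for s, t in zip(a, b))
```

Calling `close(transition(q), closed_form(q))` at a few nonzero points `q` gives a quick sanity check of the formula; it is of course no substitute for the algebraic proof requested in Exercise 3.1.1.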


Example 3.1.5 (Sphere with Another Atlas). We can prove that the unit sphere
$S^2$ is a smooth two-dimensional manifold using another atlas, this time using rectangular
coordinates for the parametrizations.

Consider a point $p = (x,y,z) \in S^2$, and let $V = \{(u,v) \mid u^2 + v^2 < 1\}$. If
$z > 0$, then the mapping $X_{(1)} : V \to \mathbb{R}^3$ defined by $X_{(1)}(u,v) = (u, v, \sqrt{1 - u^2 - v^2})$ is clearly
a bijection between $V$ and $S^2 \cap \{(x,y,z) \mid z > 0\}$. $X_{(1)}$ is also a homeomorphism
because it is continuous and its inverse $X_{(1)}^{-1}$ is simply the vertical projection of
the upper unit sphere onto $\mathbb{R}^2$, and since projection is a linear transformation, it is
continuous.

We cover $S^2$ with the following parametrizations $X_{(i)} : V \to \mathbb{R}^3$:

if $z > 0$, $X_{(1)}(u,v) = (u, v, \sqrt{1 - u^2 - v^2})$,
if $z < 0$, $X_{(2)}(u,v) = (u, v, -\sqrt{1 - u^2 - v^2})$,
if $y > 0$, $X_{(3)}(u,v) = (u, \sqrt{1 - u^2 - v^2}, v)$,
if $y < 0$, $X_{(4)}(u,v) = (u, -\sqrt{1 - u^2 - v^2}, v)$,
if $x > 0$, $X_{(5)}(u,v) = (\sqrt{1 - u^2 - v^2}, u, v)$,
if $x < 0$, $X_{(6)}(u,v) = (-\sqrt{1 - u^2 - v^2}, u, v)$.

Figure 3.3 depicts an expanded view of these coordinate patches. The inverses for
each of these parametrizations give coordinate charts $\phi_i = X_{(i)}^{-1} : U_i \to V_i$, which
together form an atlas on the sphere.
Figure 3.3: Six coordinate patches on the sphere.


We notice in this case that all $V_i = \{(u,v) \mid u^2 + v^2 < 1\}$. Also, not all $U_i$
overlap; in particular, $U_1 \cap U_2 = \emptyset$, $U_3 \cap U_4 = \emptyset$, and $U_5 \cap U_6 = \emptyset$. To show that
the sphere equipped with this atlas is a differentiable manifold, we must show that
all transition functions are $C^1$. We illustrate this with $\phi_{31} = \phi_3 \circ \phi_1^{-1}$.

The identification $(\bar u, \bar v) = \phi_3 \circ \phi_1^{-1}(u,v)$ is equivalent to

$$(\bar u, \sqrt{1 - \bar u^2 - \bar v^2}, \bar v) = (u, v, \sqrt{1 - u^2 - v^2}).$$

This leads to

$$\phi_1(U_1 \cap U_3) = \{(u,v) \mid u^2 + v^2 < 1 \text{ and } v > 0\},$$
$$\phi_3(U_1 \cap U_3) = \{(\bar u, \bar v) \mid \bar u^2 + \bar v^2 < 1 \text{ and } \bar v > 0\},$$

and

$$\phi_{31}(u,v) = (u, \sqrt{1 - u^2 - v^2}).$$

It is now easy to verify that $\phi_{31}$ is of class $C^1$ over $\phi_1(U_1 \cap U_3)$. In fact, higher
derivatives of $\phi_{31}$ involve polynomials in $u$ and $v$ possibly divided by powers of
$\sqrt{1 - u^2 - v^2}$. Hence, over $\phi_1(U_1 \cap U_3)$, the function $\phi_{31}$ is of class $C^\infty$. It is
not hard to see that all other transition functions are similar. Thus, this atlas
$\mathcal{A} = \{\phi_i\}_{i=1}^{6}$ equips $S^2$ with the structure of a smooth manifold.
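As a small illustration of the transition function computed in Example 3.1.5, the sketch below composes the chart on the patch $y > 0$ with the parametrization of the upper hemisphere and compares the result with the formula $(u, \sqrt{1-u^2-v^2})$. The names `X1`, `phi1`, `phi3`, and `phi31` are illustrative, and the code only treats this one pair of patches.

```python
import math

def X1(u, v):
    """Parametrization of the open upper hemisphere (z > 0)."""
    return (u, v, math.sqrt(1.0 - u * u - v * v))

def phi1(p):
    """Chart on the patch z > 0: vertical projection onto the xy-plane."""
    x, y, z = p
    return (x, y)

def phi3(p):
    """Chart on the patch y > 0: projection that drops the y coordinate."""
    x, y, z = p
    return (x, z)

def phi31(u, v):
    """Claimed closed form of phi_3 o phi_1^{-1} on the overlap (v > 0)."""
    return (u, math.sqrt(1.0 - u * u - v * v))
```

For a point $(u, v)$ of the open unit disk with $v > 0$, `phi3(X1(u, v))` and `phi31(u, v)` should agree, which is exactly the identification made in the example.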


Example 3.1.6 (Projective Space). The $n$-dimensional real projective space $\mathbb{RP}^n$
is defined as the set of lines in $\mathbb{R}^{n+1}$ through the origin. No two distinct lines through the
origin intersect any place else and, for each nonzero point $p$ in $\mathbb{R}^{n+1}$, there exists a unique line
through the origin and $p$. Therefore, one can describe $\mathbb{RP}^n$ as the set of equivalence
classes of points in $\mathbb{R}^{n+1} - \{(0,0,\ldots,0)\}$ under the equivalence relation

$$(x_0, x_1, \ldots, x_n) \sim (y_0, y_1, \ldots, y_n) \quad\text{if and only if}\quad (y_0, y_1, \ldots, y_n) = (\lambda x_0, \lambda x_1, \ldots, \lambda x_n) \text{ for some } \lambda \in \mathbb{R} - \{0\}.$$


We designate the equivalence class of a point $(x_0, x_1, \ldots, x_n)$ by the notation $(x_0 : x_1 : \ldots : x_n)$.
The set $\mathbb{RP}^n$ is a topological space with the quotient topology coming
from the quotient map $\pi : \mathbb{R}^{n+1} - \{0\} \to \mathbb{RP}^n$ given by $\pi(x_0, x_1, \ldots, x_n) = (x_0 : x_1 : \ldots : x_n)$.
Since $\mathbb{R}^{n+1} - \{0\}$ is second countable (has a countable base, namely
the open balls in $\mathbb{R}^{n+1} - \{0\}$ of rational radius with centers of rational coordinates)
and the quotient map $\pi$ is open, the quotient space $\mathbb{RP}^n$ inherits a countable base.

General theorems in topology quickly establish that $\mathbb{RP}^n$ is Hausdorff, but we give
a direct proof here. Call $O = (0,0,\ldots,0)$ in $\mathbb{R}^{n+1}$. For $\alpha > 0$ and $A \in \mathbb{R}^{n+1} - \{0\}$,
define $C_\alpha(A)$ as the double open cone

$$C_\alpha(A) = \{B \in \mathbb{R}^{n+1} \mid \angle AOB < \alpha \text{ or } \angle AOB > \pi - \alpha\},$$

with axis of revolution $(OA)$ and opening angle of $2\alpha$. For all $\alpha$, the cone $C_\alpha(A)$ is
an open subset of $\mathbb{R}^{n+1}$.

Let $p, q \in \mathbb{RP}^n$ be distinct points. Let $p_1 \in \pi^{-1}(p)$ and let $q_1 \in \pi^{-1}(q)$ such that
$\angle p_1 O q_1 \leq \pi/2$. Since $p \neq q$, the angle $\angle p_1 O q_1$ is positive. Define $\alpha$ to be an angle
with $0 < \alpha < \frac{1}{2} \angle p_1 O q_1$. Then $C_\alpha(p_1) \cap C_\alpha(q_1) = \emptyset$.

Call $U = \pi(C_\alpha(p_1))$ and $V = \pi(C_\alpha(q_1))$. Since $C_\alpha(p_1)$ and $C_\alpha(q_1)$ are open, the
topology on $\mathbb{RP}^n$ is defined so that $U$ and $V$ are open in $\mathbb{RP}^n$. Furthermore, $p \in U$
and $q \in V$. Also,

$$\pi^{-1}(U \cap V) = \pi^{-1}(U) \cap \pi^{-1}(V) = C_\alpha(p_1) \cap C_\alpha(q_1) = \emptyset,$$

where the middle equality holds because we used cones, namely unions of lines
through $O$. However, the function $\pi$ is surjective, so we deduce that $U \cap V = \emptyset$.
Since $p$ and $q$ were arbitrarily chosen, we deduce that $\mathbb{RP}^n$ is Hausdorff.


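We can define an atlas on $\mathbb{RP}^n$ as follows. Note that if
$(x_0, x_1, \ldots, x_n) \sim (y_0, y_1, \ldots, y_n)$, then for any $i$, we have $x_i = 0$ if and only if $y_i = 0$.
For $i \in \{0, 1, \ldots, n\}$, define $U_i = \{(x_0 : x_1 : \ldots : x_n) \in \mathbb{RP}^n \mid x_i \neq 0\}$ and define
$\phi_i : U_i \to \mathbb{R}^n$ by

$$\phi_i(x_0 : x_1 : \ldots : x_n) = \left( \frac{x_0}{x_i}, \frac{x_1}{x_i}, \ldots, \widehat{\frac{x_i}{x_i}}, \ldots, \frac{x_n}{x_i} \right),$$

where the hat notation indicates deleting that entry from the $(n+1)$-tuple. It is
easy to see that each $\phi_i$ is a homeomorphism between $U_i$ and $\mathbb{R}^n$. Furthermore,
$U_0 \cup \cdots \cup U_n$ includes all ratios $(x_0 : \ldots : x_n)$ for which not all $x_i = 0$. Thus,
$U_0 \cup \cdots \cup U_n = \mathbb{RP}^n$.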

So far, we have established that RP” has the structure of a topological manifold 
and we have given it natural charts. We need to show that the transition functions 
between coordinate patches are differentiable. 
Assume without loss of generality that $i < j$. Then
$\phi_i(U_i \cap U_j) = \{(a_1, \ldots, a_n) \in \mathbb{R}^n \mid a_j \neq 0\}$ and
$\phi_j(U_i \cap U_j) = \{(a_1, \ldots, a_n) \in \mathbb{R}^n \mid a_{i+1} \neq 0\}$. (The apparent
difference comes from $i < j$.) Then the change-of-coordinate function $\phi_j \circ \phi_i^{-1}$ is

$$\phi_j \circ \phi_i^{-1}(a_1, \ldots, a_n) = \phi_j(a_1 : \ldots : a_i : 1 : a_{i+1} : \ldots : a_n) = \left( \frac{a_1}{a_j}, \ldots, \frac{a_i}{a_j}, \frac{1}{a_j}, \frac{a_{i+1}}{a_j}, \ldots, \underbrace{\widehat{\frac{a_j}{a_j}}}_{(j+1)\text{th}}, \ldots, \frac{a_n}{a_j} \right).$$

Note that we remove the $(j+1)$th entry from an $(n+1)$-tuple whose first
index is 1.

It is not hard to see that $\phi_j \circ \phi_i^{-1}$ is indeed a bijection between $\phi_i(U_i \cap U_j)$
and $\phi_j(U_i \cap U_j)$. Furthermore, all higher partial derivatives of $\phi_j \circ \phi_i^{-1}$ exist over
$\phi_i(U_i \cap U_j)$.

The same reasoning works if $i > j$. Therefore, this atlas satisfies the condition
required to equip $\mathbb{RP}^n$ with the structure of a smooth manifold.
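The charts on projective space and their transition functions lend themselves to a short computational sketch. The code below is only an illustration under stated assumptions: points of $\mathbb{RP}^n$ are represented by tuples of homogeneous coordinates with rational entries, and the names `phi`, `phi_inv`, and `transition` are hypothetical helpers, not notation from the text. Exact rational arithmetic is used so that equalities can be tested without rounding.

```python
from fractions import Fraction

def phi(i, x):
    """Chart phi_i on U_i = {x_i != 0}: divide by x_i and delete entry i.

    x is a tuple of homogeneous coordinates (x_0, ..., x_n).
    """
    if x[i] == 0:
        raise ValueError("the point does not lie in U_i")
    return tuple(Fraction(x[k]) / x[i] for k in range(len(x)) if k != i)

def phi_inv(i, a):
    """Inverse chart: reinsert a 1 in slot i to get homogeneous coordinates."""
    a = tuple(a)
    return a[:i] + (Fraction(1),) + a[i:]

def transition(j, i, a):
    """The change of coordinates phi_j o phi_i^{-1} applied to a in R^n."""
    return phi(j, phi_inv(i, a))
```

For instance, with $n = 3$, $i = 1$, $j = 2$ and $a = (2, 3, 5)$, the homogeneous coordinates are $(2 : 1 : 3 : 5)$, and dividing by the entry in slot 2 and deleting it gives $(2/3, 1/3, 5/3)$, in agreement with the displayed formula.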

We point out that it is possible to define $\mathbb{RP}^n$ in a slightly different way. Consider
the unit sphere $S^n$ as a subset of $\mathbb{R}^{n+1}$ and consider the antipodal function
$A : S^n \to S^n$ defined by $A(p) = -p$. We can define $\mathbb{RP}^n$ as $S^n$ where antipodal points
are identified. In other words, projective space is the set of equivalence classes of
antipodal points

$$\mathbb{RP}^n = \{\{p, -p\} \mid p \in S^n\}.$$


We define the projection $\pi : S^n \to \mathbb{RP}^n$ as the function $\pi(p) = [p]$, where $[p] = \{p, -p\}$
is the equivalence class. This function helps define the topology on $\mathbb{RP}^n$
(see Section A.2.3), but it is not as simple to define the manifold structure of $\mathbb{RP}^n$
from this quotient map.


Before providing more examples, we must emphasize a technical aspect of the 
definition of a differentiable manifold. If M is a topological manifold not inherently 
defined as the subset of a Euclidean space, we do not study whether M is or is not a 
differentiable manifold, but rather, we discuss whether it is possible to equip M with 
an atlas that equips it with the structure of a differentiable manifold. Also, as we 
saw in the above examples, since the domains of the charts cover M, these domains 
provide the open neighborhoods for each point that occur in the last condition of a 
topological manifold. 


Definition 3.1.7. Two differentiable (respectively, $C^k$, smooth, analytic) atlases
$\{\phi_\alpha\}$ and $\{\psi_\beta\}$ on a topological manifold $M$ are said to be compatible if the union
of the two atlases is again an atlas on $M$ in which all the transition functions are
differentiable (respectively, $C^k$, smooth, analytic).


Interestingly enough, not all atlases are compatible in a given category. It is also
possible for the union of two atlases of class $C^k$ to form an atlas of class $C^l$, with
$l < k$. The notion of compatibility between atlases is an equivalence relation, and
an equivalence class of differentiable (respectively, $C^k$, smooth, analytic) atlases is
called a differentiable (respectively, $C^k$, smooth, analytic) structure. Proving that a
given topological manifold has a unique differentiable structure or enumerating the
differentiable structures on a given topological manifold involves techniques that
are beyond the scope of this book. For example, in [29], published in 1963, Kervaire
and Milnor prove that $S^7$ has exactly 28 nondiffeomorphic smooth structures.


Example 3.1.8. We point out that for any integer $n \geq 1$, the Euclidean space $\mathbb{R}^n$
is an $n$-dimensional manifold. (The standard atlas consists of only one function,
the identity function on $\mathbb{R}^n$.)


Example 3.1.9. A manifold M of dimension 0 is a set of points with the discrete 
topology, i.e., every subset of M is open. The notion of differentiability is vacuous 
over a 0-dimensional manifold. 


Note that this example indicates that a manifold is not necessarily connected 
but may be a union of connected components, each of which is a manifold in its 
own right. 


Example 3.1.10 (An Alternate Smooth Structure on $\mathbb{R}$). Let $M = \mathbb{R}$ and consider
the function $\psi : M \to \mathbb{R}$ defined by $\psi(x) = x^3$. The function $\psi$ is a homeomorphism,
so the singleton set $\{\psi\}$ forms an atlas on $\mathbb{R}$. The standard structure on $\mathbb{R}$, as
described in Example 3.1.8, uses the atlas $\{\phi\}$, where $\phi : M \to \mathbb{R}$ is $\phi(x) = x$.
However, though $\{\phi\}$ and $\{\psi\}$ define smooth structures on $\mathbb{R}$, these two atlases are
incompatible. Consider the function $\phi \circ \psi^{-1}(x) = \sqrt[3]{x}$. It is a homeomorphism but
it is not differentiable at 0. Hence, $\{\phi, \psi\}$ is not a differentiable atlas, let alone a
smooth one.
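The failure of differentiability at 0 in Example 3.1.10 can be seen numerically: the difference quotient of the cube root at 0 equals $h^{-2/3}$, which grows without bound as $h \to 0$. The sketch below is illustrative only; the names `chart_change` and `difference_quotient` are not from the text.

```python
import math

def chart_change(x):
    """The map phi o psi^{-1}(x), i.e. the real cube root of x."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def difference_quotient(h):
    """Difference quotient of chart_change at 0 with step h != 0."""
    return (chart_change(h) - chart_change(0.0)) / h
```

Evaluating `difference_quotient` at steps such as 1.0, 1e-3, and 1e-6 produces values that increase as the step shrinks, consistent with the derivative at 0 not existing.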


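Example 3.1.11 (Open Subsets of Manifolds). Let $M^n$ be a differentiable manifold
with atlas $\mathcal{A} = \{\phi_\alpha : U_\alpha \to \mathbb{R}^n\}_{\alpha \in I}$. Let $V$ be an open subset of $M$. Consider the
set of functions $\mathcal{A}' = \{\phi_\alpha|_V : U_\alpha \cap V \to \mathbb{R}^n\}_{\alpha \in I}$. We have

$$\bigcup_{\alpha \in I} (U_\alpha \cap V) = \Big( \bigcup_{\alpha \in I} U_\alpha \Big) \cap V = M \cap V = V.$$

Hence the sets $U_\alpha \cap V$, for $\alpha \in I$, cover $V$. Because $\phi_\alpha$ is a homeomorphism onto its image
$\phi_\alpha(U_\alpha)$, the set $\phi_\alpha(U_\alpha \cap V)$ is open in $\mathbb{R}^n$ and $\phi_\alpha|_V$ is also a homeomorphism. Finally,

$$\phi_\alpha|_V \circ (\phi_\beta|_V)^{-1} = \phi_\alpha \circ \phi_\beta^{-1} \big|_{\phi_\beta(U_\alpha \cap U_\beta \cap V)},$$

which is the same class of function as $\phi_\alpha \circ \phi_\beta^{-1}$. Hence, if $V$ is any open subset of
$M$, then it inherits the same class of structure as $M$.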


When working with examples of manifolds that are subsets of $\mathbb{R}^k$, it is often
easier to specify coordinate charts $x : U \subset M^n \to \mathbb{R}^n$ by providing a parametrization
$x^{-1} : x(U) \to M^n$ that is a homeomorphism onto its image. Since the chart is a
homeomorphism, this habit does not lead to any difficulties.
Figure 3.4: A square as a differentiable manifold.


Example 3.1.12. Consider a trefoil knot $K$ in $\mathbb{R}^3$. One can realize $K$ as the image
of the parametric curve

$$\gamma(t) = ((2 + \cos(3t)) \cos(2t), (2 + \cos(3t)) \sin(2t), \sin(3t))$$

for $t \in \mathbb{R}$. We can choose an atlas of $K$ as follows. Set one coordinate patch on
$K$ to be $U_1 = \gamma((0, 2\pi))$ and another patch to be $U_2 = \gamma((\pi, 3\pi))$. Use as charts
the functions $\phi_1$ and $\phi_2$, which are the inverse functions of $\gamma : (0, 2\pi) \to K$ and
$\gamma : (\pi, 3\pi) \to K$, respectively. Now

$$\phi_2(U_1 \cap U_2) = (\pi, 2\pi) \cup (2\pi, 3\pi) \quad\text{and}\quad \phi_1(U_1 \cap U_2) = (0, \pi) \cup (\pi, 2\pi),$$

and the coordinate transition functions are

$$\phi_1 \circ \phi_2^{-1}(t) = \begin{cases} t, & \text{if } t \in (\pi, 2\pi), \\ t - 2\pi, & \text{if } t \in (2\pi, 3\pi), \end{cases} \qquad \phi_2 \circ \phi_1^{-1}(t) = \begin{cases} t + 2\pi, & \text{if } t \in (0, \pi), \\ t, & \text{if } t \in (\pi, 2\pi). \end{cases}$$

Both of these transition functions are differentiable on their domains. This shows
that $K$, equipped with the given atlas, is a 1-manifold.
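The transition function of Example 3.1.12 rests on the fact that $\gamma$ is periodic with period $2\pi$. A minimal numerical sketch of this, with illustrative names `gamma` and `phi1_after_phi2_inv` that do not appear in the text:

```python
import math

def gamma(t):
    """The trefoil parametrization gamma(t) from Example 3.1.12."""
    r = 2.0 + math.cos(3.0 * t)
    return (r * math.cos(2.0 * t), r * math.sin(2.0 * t), math.sin(3.0 * t))

def phi1_after_phi2_inv(t):
    """The transition phi_1 o phi_2^{-1} on (pi, 2 pi) U (2 pi, 3 pi)."""
    if math.pi < t < 2.0 * math.pi:
        return t
    if 2.0 * math.pi < t < 3.0 * math.pi:
        return t - 2.0 * math.pi
    raise ValueError("t is outside the domain of the transition function")
```

For any admissible `t`, the parameter `phi1_after_phi2_inv(t)` lies in $(0, 2\pi)$ and `gamma` takes the same value at both parameters, which is what it means for the two charts to describe the same point of $K$.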


From the previous example, it is easy to see that any regular, simple, closed
curve in $\mathbb{R}^k$ can be given an atlas that gives it the structure of a differentiable
1-manifold. Our intuition might tell us that, say, a square in the plane should not be
a differentiable 1-manifold because of its corners. This idea, however, is erroneous, 
as we shall now explain. 


Example 3.1.13. Consider the square with unit length side, and define two chart
functions as follows. The function $\phi_1$ measures the distance traveled as one travels
around the square in a counterclockwise direction, starting with a value of 0 at
$(0,0)$. The function $\phi_2$ measures the distance traveled as one travels around the
square in the same direction, starting with a value of 1 at $(1,0)$ (see Figure 3.4).




Figure 3.5: Not a bijection. 


Figure 3.6: The Klein bottle. 


The functions $\phi_1$ and $\phi_2$ are homeomorphisms, and the coordinate transition
function is

$$\phi_1 \circ \phi_2^{-1} : (1,4) \cup (4,5) \to (0,1) \cup (1,4),$$

with

$$\phi_1 \circ \phi_2^{-1}(x) = \begin{cases} x, & \text{if } x \in (1,4), \\ x - 4, & \text{if } x \in (4,5). \end{cases}$$

This transition function (and its inverse, the other transition function) is differentiable
over its domain. Therefore, the atlas $\{\phi_1, \phi_2\}$ equips the square with the
structure of a differentiable manifold.
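The arc-length charts on the square can be modeled directly. In the sketch below, which assumes the square has vertices $(0,0)$, $(1,0)$, $(1,1)$, $(0,1)$, the helper `point_at` (a hypothetical name) returns the point reached after traveling a counterclockwise distance $s$ from $(0,0)$; both chart inverses are restrictions of it, to $(0,4)$ and $(1,5)$ respectively.

```python
def point_at(s):
    """Point of the unit square at counterclockwise arc length s from (0, 0)."""
    s = s % 4.0
    if s <= 1.0:
        return (s, 0.0)          # bottom edge, left to right
    if s <= 2.0:
        return (1.0, s - 1.0)    # right edge, upward
    if s <= 3.0:
        return (3.0 - s, 1.0)    # top edge, right to left
    return (0.0, 4.0 - s)        # left edge, downward

def transition(x):
    """The transition phi_1 o phi_2^{-1} on (1, 4) U (4, 5)."""
    if 1.0 < x < 4.0:
        return x
    if 4.0 < x < 5.0:
        return x - 4.0
    raise ValueError("x is outside the domain of the transition function")
```

On each of the two intervals the transition is a translation, so it is differentiable even though the parametrized points pass through the corners of the square.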


This example shows that, in and of itself, the square can be given the structure 
of a differentiable 1-manifold. However, this does not violate our intuition about 
differentiability and smoothness because one only perceives the “sharp” corners 
of the square in reference to the differential structure of $\mathbb{R}^2$. Once we have the
appropriate definitions, we will say that the square is not a submanifold of $\mathbb{R}^2$
with the usual differential structure (see Definition 3.6.1). In fact, the atlases in 
Examples 3.1.12 and 3.1.13 bear considerable similarity, and, ignoring the structure 
of the ambient space, both the square and the knot resemble a circle. We develop 
these notions further when we consider functions between manifolds. 

It is not hard to verify that a regular surface $S$ in $\mathbb{R}^3$ (see Definition 3.1.1) is
a differentiable 2-manifold. The only nonobvious part is showing that the proper- 
ties of coordinate patches of a regular surface imply that the coordinate transition 
functions are differentiable. We leave this as an exercise for the reader (see Problem 
3.1.6). 

Parametrized surfaces that are not regular surfaces provide examples of geometric
sets in $\mathbb{R}^3$ that are not differentiable manifolds. For example, with the surface
in Figure 3.5, for any point $p$ along the line of self-intersection, there cannot exist an
open set of $\mathbb{R}^2$ that is in bijective correspondence with any given neighborhood of
$p$. However, the notion of a regular surface in $\mathbb{R}^3$ has more restrictions than that of
a 2-manifold for two reasons. Applying the ideas behind Example 3.1.13, a circular
(single) cone can be given the structure of a differentiable manifold even though it
is not a regular surface. Furthermore, not every differentiable 2-manifold can be
realized as a regular surface or even as a union of such surfaces in $\mathbb{R}^3$. A simple
example is the Klein bottle, defined topologically as follows. Consider a rectangle,
and identify opposite edges according to Figure 3.6. One pair of sides is identified
directly opposite each other (in Figure 3.6, the horizontal edges), and the other pair
of sides is identified in the reverse direction.

It is not hard to see that the Klein bottle can be given an atlas that makes it 
a differentiable 2-manifold. However, it turns out that the Klein bottle cannot be 
realized as a regular surface in $\mathbb{R}^3$.

We end the section by defining the product structure of two manifolds. 


Definition 3.1.14. Let $M^m$ and $N^n$ be two differentiable (respectively, $C^k$, smooth,
analytic) manifolds. Call their respective atlases $\{\phi_\alpha\}_{\alpha \in I}$ and $\{\psi_\beta\}_{\beta \in J}$. Consider
the set $M \times N$ that is equipped with the product topology. If $\phi : U \to \mathbb{R}^m$ is a chart
for $M$ and $\psi : V \to \mathbb{R}^n$ is a chart for $N$, then define the function $\phi \times \psi : U \times V \to \mathbb{R}^{m+n}$
by $\phi \times \psi(p_1, p_2) = (\phi(p_1), \psi(p_2))$. The collection $\{\phi_\alpha \times \psi_\beta\}_{(\alpha,\beta) \in I \times J}$ defines
a differentiable (respectively, $C^k$, smooth, analytic) structure on $M \times N$, called the
product structure.


Consider, for example, the circle $S^1$ with a smooth structure. The product
$S^1 \times S^1$ is topologically a (two-dimensional) torus, and the product structure
defines a smooth structure on the torus. By extending this construction, we define
the 3-torus as the manifold $T^3 = S^1 \times S^1 \times S^1$ and inductively the $n$-torus as
$T^n = T^{n-1} \times S^1$.
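The product chart $\phi \times \psi$ is simple enough to express as a small higher-order function. The sketch below is an illustration, not the book's construction: `angle_chart` is an assumed chart on the circle minus the point $(1,0)$ that returns the angle in $(0, 2\pi)$, and `product_chart` is a hypothetical helper that builds $\phi \times \psi$ from any two charts returning tuples.

```python
import math

def angle_chart(p):
    """Chart on S^1 minus the point (1, 0): the angle, taken in (0, 2*pi)."""
    x, y = p
    return (math.atan2(y, x) % (2.0 * math.pi),)

def product_chart(phi, psi):
    """Return the chart phi x psi, sending (p1, p2) to (phi(p1), psi(p2))."""
    def chart(p1, p2):
        return tuple(phi(p1)) + tuple(psi(p2))
    return chart

# A chart on part of the torus S^1 x S^1, built as a product of angle charts.
torus_chart = product_chart(angle_chart, angle_chart)
```

Applying `torus_chart` to the pair of points with angles 1 and 4 radians returns coordinates close to `(1.0, 4.0)`; a full atlas for the torus would need products of at least two such angle charts on each factor.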


PROBLEMS 


3.1.1. Stereographic Projection. One way to define coordinates on the surface of the
sphere $S^2$ given by $x^2 + y^2 + z^2 = 1$ is to use the stereographic projection
$\pi_N : S^2 - \{N\} \to \mathbb{R}^2$, where $N = (0,0,1)$, defined as follows. Given any point
$p \in S^2 - \{N\}$, the line $(pN)$ intersects the $xy$-plane at exactly one point, which is the
image $\pi_N(p)$. If $(x,y,z)$ are the coordinates for $p$ in $S^2$, let us write
$\pi_N(x,y,z) = (u,v)$ (see Figure 3.2).

(a) Prove that $\pi_N(x,y,z) = \left( \dfrac{x}{1-z}, \dfrac{y}{1-z} \right)$.

(b) Prove that $\pi_N^{-1}(u,v) = \left( \dfrac{2u}{u^2+v^2+1}, \dfrac{2v}{u^2+v^2+1}, \dfrac{u^2+v^2-1}{u^2+v^2+1} \right)$.

(c) Show that $\pi_S \circ \pi_N^{-1}(u,v) = \left( \dfrac{u}{u^2+v^2}, \dfrac{v}{u^2+v^2} \right)$.

3.1.2. Consider the $n$-dimensional sphere $S^n = \{(x_1, \ldots, x_{n+1}) \in \mathbb{R}^{n+1} \mid x_1^2 + \cdots + x_{n+1}^2 = 1\}$.
Exhibit an atlas that gives $S^n$ the structure of a differentiable $n$-manifold.
Explicitly show that the atlas you give satisfies the axioms of a manifold.
3.1.3. Let $V$ be an open set in $\mathbb{R}^k$, and let $f : V \to \mathbb{R}^m$ be a continuous function. Find
an atlas that equips the graph of $f$, defined as the subset

$$G = \{(x, f(x)) \in \mathbb{R}^{k+m} \mid x \in V\},$$

with the structure of a smooth $k$-manifold.

3.1.4. Describe an atlas for the 3-torus $T^3 = S^1 \times S^1 \times S^1$. Find a parametric function
$X : U \to \mathbb{R}^4$, where $U$ is a subset of $\mathbb{R}^3$, such that the image of $X$ is a 3-torus.

3.1.5. We revisit Example 3.1.10. Let $\phi_0(x) = x$ be the identity map on $\mathbb{R}$. The atlas
$\{\phi_0\}$ equips $\mathbb{R}$ with its usual differentiable structure. Let $\phi_1 : \mathbb{R} \to \mathbb{R}$ be defined by
$\phi_1(x) = x^3 + x$. Prove that $\phi_1$ is a homeomorphism and conclude that $\{\phi_1\}$ is a
differentiable atlas on $\mathbb{R}$. Prove that $\{\phi_0\}$ and $\{\phi_1\}$ are compatible atlases.

3.1.6. Prove that a regular surface in $\mathbb{R}^3$ (see Definition 3.1.1) is a differentiable
2-manifold.

3.1.7. Consider the following two parametrizations of the circle $S^1$ as a subset of $\mathbb{R}^2$:

$$X_1(t) = (\cos t, \sin t) \quad\text{for } t \in (0, 2\pi),$$
$$Y_1(u) = \left( \frac{1-u^2}{1+u^2}, \frac{2u}{1+u^2} \right) \quad\text{for } u \in \mathbb{R}.$$

Find functions $X_2$ and $Y_2$ “similar” to $X_1$ and $Y_1$, respectively, to make $\{X_1, X_2\}$
and $\{Y_1, Y_2\}$ atlases that give $S^1$ differentiable structures. Show that these two
differentiable structures are compatible.

3.1.8. Consider the real projective plane $\mathbb{RP}^2$. The atlas described for $\mathbb{RP}^2$ has three
coordinate charts. Calculate explicitly all six of the coordinate transition functions,
and verify directly that $\phi_{ij} = \phi_{ji}^{-1}$.

3.1.9. Consider $S^2$ to be the unit sphere in $\mathbb{R}^3$. Consider the parametrizations

$$f : (0, 2\pi) \times (0, \pi) \to S^2, \quad\text{with } f(u,v) = (\cos u \sin v, \sin u \sin v, \cos v),$$
$$g : (0, 2\pi) \times (0, \pi) \to S^2, \quad\text{with } g(\bar u, \bar v) = (-\cos \bar u \sin \bar v, \cos \bar v, \sin \bar u \sin \bar v).$$

We have seen that $f$ is injective and so is a bijection onto its range.

(a) Find the range $U$ of $f$ and the range $V$ of $g$.

(b) Determine $f^{-1}(x,y,z)$ and $g^{-1}(x,y,z)$, where $(x,y,z) \in S^2$.

(c) Show that the set of functions $\{(U, f^{-1}), (V, g^{-1})\}$ forms an atlas for $S^2$ and
equips $S^2$ with a differentiable structure.

3.1.10. Let $M^n$ be a topological manifold and let $D(M)$ be the collection of atlases on $M$
that equip $M$ with a differentiable structure. Prove that the relation of compatibility
is an equivalence relation. [Recall that two atlases $\mathcal{A}$ and $\mathcal{B}$ are compatible
when $\mathcal{A} \cup \mathcal{B} \in D(M)$.]

3.1.11. Let $M^n$ be a topological manifold and let $D(M)$ be the collection of atlases on
$M$ that equip $M$ with a differentiable structure. Consider the partial order of
containment $\subset$ on $D(M)$. Show that every chain (totally ordered subset) of
$D(M)$ has an upper bound. Use Zorn’s Lemma to conclude that $(D(M), \subset)$ has
maximal elements. [Some authors reserve the expression differentiable structure
for these maximal elements in $D(M)$.]


Figure 3.7: Differentiable map between manifolds. 


3.2 Differentiable Maps between Manifolds 


From a purely set-theoretic perspective, it is easy to define functions between man- 
ifolds. Since differentiable manifolds are topological manifolds to begin with, we 
can discuss continuous functions between manifolds just as we do in the context 
of topology. However, a differential structure on a manifold, expressed by a specific
atlas, allows us to make sense of the notion of differentiable maps between
manifolds.


Definition 3.2.1. Let $M^m$ and $N^n$ be differentiable (respectively, $C^k$, smooth,
analytic) manifolds. A continuous function $f : M^m \to N^n$ is said to be differentiable
(respectively, $C^k$, smooth, analytic) if for any chart $y : V \to \mathbb{R}^n$ on $N$ and for any
chart $x : U \to \mathbb{R}^m$ on $M$, the map

$$y \circ f \circ x^{-1} : x(U \cap f^{-1}(V)) \subset \mathbb{R}^m \longrightarrow y(V) \subset \mathbb{R}^n \tag{3.2}$$

is a differentiable (respectively, $C^k$, smooth, analytic) function. (See Figure 3.7.)
We denote by $C^k(M^m, N^n)$ the set of $C^k$-differentiable maps from $M$ to $N$.

In the above definition, the domain and codomain of $y \circ f \circ x^{-1}$ may seem
complicated, but they are the natural ones for this composition of functions.

It follows from this definition that a function between two manifolds cannot 
have a stronger differentiability property than do the manifolds themselves. (See 
Exercise 3.2.9.) In particular, if $M$ and $N$ are $C^k$-differentiable manifolds, we cannot
discuss functions of class $C^{k+1}$ or higher between them. Restricting attention to
smooth manifolds removes this concern.

In linear algebra, we do not care about all functions between vector spaces but 
only linear transformations because, in an intuitive sense, linear transformations 
“preserve the structure” of vector spaces. Furthermore, two vector spaces V and W 
are considered the same (isomorphic) if there exists a bijective linear transformation 
between the two. In the same way, in the category of differentiable manifolds where 
we only consider differentiable (or perhaps $C^k$ or smooth) maps between manifolds,
we consider two manifolds the same if they are diffeomorphic. 


Definition 3.2.2. Let $M$ and $N$ be two differentiable manifolds. A diffeomorphism
(respectively, $C^k$ diffeomorphism) between $M$ and $N$ is a bijective function $F : M \to N$
such that $F$ is differentiable (respectively, $C^k$) and $F^{-1}$ is differentiable
(respectively, $C^k$). If a diffeomorphism exists between $M$ and $N$, we say that $M$
and $N$ are diffeomorphic.


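Example 3.2.3. Consider the projection map $\pi : S^2 \to \mathbb{RP}^2$ that identifies antipodal points on the unit sphere,

$$\pi(x, y, z) = (x : y : z)$$

for any $(x, y, z) \in S^2$. For $S^2$, we use the atlas $\{\pi_N, \pi_S\}$ as presented in Example 3.1.4, and for $\mathbb{RP}^2$, we use the atlas in Example 3.1.6, namely, $\phi_i : U_i \to \mathbb{R}^2$ for $0 \le i \le 2$, with

$$\phi_0(x_0 : x_1 : x_2) = \left(\frac{x_1}{x_0}, \frac{x_2}{x_0}\right), \quad \phi_1(x_0 : x_1 : x_2) = \left(\frac{x_0}{x_1}, \frac{x_2}{x_1}\right), \quad \text{and} \quad \phi_2(x_0 : x_1 : x_2) = \left(\frac{x_0}{x_2}, \frac{x_1}{x_2}\right). \tag{3.4}$$

For each pairing of coordinate charts we have

$$\phi_i \circ \pi \circ \pi_N^{-1}(u, v) = \phi_i\left(\frac{2u}{u^2+v^2+1} : \frac{2v}{u^2+v^2+1} : \frac{u^2+v^2-1}{u^2+v^2+1}\right),$$

and, in particular,

$$\phi_0 \circ \pi \circ \pi_N^{-1}(u, v) = \left(\frac{v}{u}, \frac{u^2+v^2-1}{2u}\right)$$

with domain $\{(u, v) \in \mathbb{R}^2 \mid u \neq 0\}$. In all six cases, the resulting functions are differentiable on their domains and in fact smooth. This shows that the projection map $\pi : S^2 \to \mathbb{RP}^2$ is smooth.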
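The coordinate expression for $\phi_0 \circ \pi \circ \pi_N^{-1}$ can be sanity-checked numerically. The sketch below is only an illustration: it assumes that Example 3.1.4 uses the standard inverse stereographic projection from the north pole, and the function names (`pi_N_inv`, `phi`, `phi0_closed_form`, `check`) are hypothetical helpers, not notation from the text.

```python
def pi_N_inv(u, v):
    """Inverse stereographic projection from the north pole (assumed form)."""
    d = u * u + v * v + 1.0
    return (2 * u / d, 2 * v / d, (u * u + v * v - 1.0) / d)


def phi(i, point):
    """Chart phi_i on RP^2: divide the other homogeneous coordinates by x_i."""
    xi = point[i]
    return tuple(point[j] / xi for j in range(3) if j != i)


def phi0_closed_form(u, v):
    """Closed form of phi_0 o pi o pi_N^{-1}, valid for u != 0."""
    return (v / u, (u * u + v * v - 1.0) / (2 * u))


def check(u, v):
    """Return the chart image of a point, of its antipode, and the closed form."""
    x = pi_N_inv(u, v)
    antipode = tuple(-c for c in x)  # same class in RP^2
    return phi(0, x), phi(0, antipode), phi0_closed_form(u, v)


if __name__ == "__main__":
    print(check(0.7, -1.3))
```

The first two entries agree because $\phi_0$ only sees ratios of homogeneous coordinates, which is exactly why $\pi$ is well defined on antipodal pairs.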
Example 3.2.4. Similar to the real projective space $\mathbb{RP}^n$, we can also define the complex projective space $\mathbb{CP}^n$ as follows. Define the relation $\sim$ on nonzero $(n+1)$-tuples in $\mathbb{C}^{n+1}$ by $(z_0, z_1, \ldots, z_n) \sim (w_0, w_1, \ldots, w_n)$ if and only if there exists a nonzero $\lambda \in \mathbb{C}$ such that $w_i = \lambda z_i$ for $0 \le i \le n$. This relation is an equivalence relation, and the complex projective space $\mathbb{CP}^n$ is the set of equivalence classes, written as $\mathbb{CP}^n = (\mathbb{C}^{n+1} - \{(0, \ldots, 0)\}) / \sim$ in the notation of quotient sets. We write $(z_0 : z_1 : \cdots : z_n)$ for the equivalence class of $(z_0, z_1, \ldots, z_n)$.

The stereographic projection $\pi_N$ of the sphere onto the plane sets up a homeomorphism $h : \mathbb{CP}^1 \to S^2$ defined by

$$h(z_0 : z_1) = \begin{cases} \pi_N^{-1}(z_1/z_0), & \text{if } z_0 \neq 0, \\ (0, 0, 1), & \text{if } z_0 = 0. \end{cases}$$


Note that if $z_0 \neq 0$, then there is a unique $z'$ such that $(z_0 : z_1) = (1 : z')$, namely, $z' = z_1/z_0$, and that if $z_0 = 0$, then $(z_0 : z_1) = (0 : z)$ for all $z \neq 0$. Therefore, one sometimes says that $\mathbb{CP}^1$ is the complex plane $\mathbb{C}$ with a "point at infinity," where this point at infinity corresponds to the class of $(0 : z)$. The function $h$ is a bijection that maps the point at infinity to the north pole of the sphere, but we leave it as an exercise for the reader to verify that this function is indeed a homeomorphism.

Complex analysis studies holomorphic (i.e., analytic) functions. This notion is tantamount to differentiability in the complex variable. Any holomorphic function $f : \mathbb{C} \to \mathbb{C}$ defines a map $p_f : S^2 \to S^2$ by identifying $\mathbb{R}^2$ with $\mathbb{C}$ and setting

$$p_f(q) = \begin{cases} \pi_N^{-1} \circ f \circ \pi_N(q), & \text{if } q \neq (0, 0, 1), \\ (0, 0, 1), & \text{if } q = (0, 0, 1). \end{cases}$$

(That $p_f$ must send $(0, 0, 1)$ to $(0, 0, 1)$ follows from a theorem in complex analysis, namely Liouville's Theorem.)

Consider $S^2$ as a differentiable manifold with atlas $\{\pi_N, \pi_S\}$, with coordinates $(u, v)$ and $(\bar{u}, \bar{v})$ respectively, as described in Example 3.1.4. It is interesting to notice that, according to Example 3.1.4, the change-of-coordinates map $\pi_S \circ \pi_N^{-1}$ corresponds to $z \mapsto 1/\bar{z}$ over $\mathbb{C} - \{0\}$, where $\bar{z}$ is the complex conjugate of $z$.

Take for example $f(z) = z^2$. The associated function $p_f$ leaves $(0, 0, -1)$ and $(0, 0, 1)$ fixed and acts in a nonobvious manner on $S^2$. According to Definition 3.2.1, in order to verify the differentiability of $p_f$ as a function $S^2 \to S^2$, we need to determine explicitly the four combinations

$$(\pi_N \text{ or } \pi_S) \circ p_f \circ (\pi_N \text{ or } \pi_S)^{-1}$$

and show that they are differentiable on their appropriate domains.

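Setting $z = u + iv$, we have $z^2 = (u^2 - v^2) + (2uv)i$. Since we are using the stereographic projection from the north pole to define $p_f$ in the first place, we have $\pi_N \circ p_f \circ \pi_N^{-1}(u, v) = (u^2 - v^2, 2uv)$. Determining the other three combinations, we find that

$$\begin{aligned}
\pi_N \circ p_f \circ \pi_N^{-1}(u, v) &= (u^2 - v^2, 2uv), \\
\pi_S \circ p_f \circ \pi_N^{-1}(u, v) &= \left(\frac{u^2 - v^2}{(u^2 + v^2)^2}, \frac{2uv}{(u^2 + v^2)^2}\right), \\
\pi_N \circ p_f \circ \pi_S^{-1}(\bar{u}, \bar{v}) &= \left(\frac{\bar{u}^2 - \bar{v}^2}{(\bar{u}^2 + \bar{v}^2)^2}, \frac{2\bar{u}\bar{v}}{(\bar{u}^2 + \bar{v}^2)^2}\right), \\
\pi_S \circ p_f \circ \pi_S^{-1}(\bar{u}, \bar{v}) &= (\bar{u}^2 - \bar{v}^2, 2\bar{u}\bar{v}).
\end{aligned}$$

It is not hard to show that, with $f(z) = z^2$, the corresponding natural domains of these four functions are $\mathbb{R}^2$, $\mathbb{R}^2 - \{(0, 0)\}$, $\mathbb{R}^2 - \{(0, 0)\}$, and $\mathbb{R}^2$. Then it is an easy check that all these functions are differentiable on their domains and, hence, that $p_f$ is a differentiable function from $S^2$ to $S^2$.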
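The four chart expressions of $p_f$ for $f(z) = z^2$ can be checked numerically with complex arithmetic. This is a minimal sketch under stated assumptions: it takes $\pi_N(x,y,z) = (x + iy)/(1 - z)$ and $\pi_S(x,y,z) = (x + iy)/(1 + z)$ as the stereographic charts (chosen so that $\pi_S \circ \pi_N^{-1}$ is $z \mapsto 1/\bar{z}$, as stated in the example), and all function names are hypothetical.

```python
def pi_N(p):
    """Stereographic projection from the north pole, as a complex number."""
    x, y, z = p
    return complex(x, y) / (1.0 - z)


def pi_S(p):
    """Stereographic projection from the south pole, as a complex number."""
    x, y, z = p
    return complex(x, y) / (1.0 + z)


def pi_N_inv(w):
    s = abs(w) ** 2
    return (2 * w.real / (s + 1), 2 * w.imag / (s + 1), (s - 1) / (s + 1))


def pi_S_inv(w):
    s = abs(w) ** 2
    return (2 * w.real / (s + 1), 2 * w.imag / (s + 1), (1 - s) / (s + 1))


NORTH = (0.0, 0.0, 1.0)


def p_f(q, f=lambda z: z * z):
    """The map p_f on the sphere; the north pole is fixed by definition."""
    if q == NORTH:
        return NORTH
    return pi_N_inv(f(pi_N(q)))


def local_expressions(w):
    """Four chart expressions of p_f at a nonzero chart coordinate w."""
    return {
        "NN": pi_N(p_f(pi_N_inv(w))),
        "SN": pi_S(p_f(pi_N_inv(w))),
        "NS": pi_N(p_f(pi_S_inv(w))),
        "SS": pi_S(p_f(pi_S_inv(w))),
    }


if __name__ == "__main__":
    print(local_expressions(complex(0.5, 1.2)))
```

In complex notation the four expressions are $w^2$, $w^2/|w|^4$, $w^2/|w|^4$, and $w^2$, which matches the real-coordinate formulas above.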


Since $\mathbb{R}$ is a one-dimensional manifold, if $M$ is a differentiable manifold, we can discuss whether a real-valued function $f : M \to \mathbb{R}$ is differentiable by testing it against Definition 3.2.1. Suppose that $f$ is differentiable, that $p$ is a point of $M$, and that $x : U \to \mathbb{R}^m$ is a coordinate chart of a neighborhood of $p$. Then $f \circ x^{-1}$ is a differentiable function from the open set $x(U)$ in $\mathbb{R}^m$ to $\mathbb{R}$, and we define the partial derivative of $f$ at $p$ in the $x^i$ coordinate as

$$\left.\frac{\partial f}{\partial x^i}\right|_p \overset{\text{def}}{=} \left.\frac{\partial (f \circ x^{-1})}{\partial x^i}\right|_{x(p)}. \tag{3.5}$$

The notation on the left-hand side is defined by the partial derivative on the right-hand side, which is taken in the usual multivariable calculus sense.
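Equation (3.5) says that a partial derivative on a manifold is an ordinary partial derivative of $f \circ x^{-1}$. The sketch below approximates it by a central difference for the height function $f(x,y,z) = z$ on the sphere in the north-pole stereographic chart. The inverse chart formula is an assumption about Example 3.1.4, and the helper names are hypothetical.

```python
def pi_N_inv(u, v):
    """Inverse stereographic projection from the north pole (assumed form)."""
    d = u * u + v * v + 1.0
    return (2 * u / d, 2 * v / d, (u * u + v * v - 1.0) / d)


def height(p):
    """The function f(x, y, z) = z restricted to the sphere."""
    return p[2]


def partial(f, chart_inv, coords, i, h=1e-6):
    """Central-difference estimate of d(f o x^{-1})/dx^i at the given coords."""
    plus = list(coords)
    minus = list(coords)
    plus[i] += h
    minus[i] -= h
    return (f(chart_inv(*plus)) - f(chart_inv(*minus))) / (2 * h)


if __name__ == "__main__":
    u, v = 0.4, -0.9
    print(partial(height, pi_N_inv, (u, v), 0), 4 * u / (u * u + v * v + 1) ** 2)
```

The printed pair compares the estimate with the closed form $4u/(u^2+v^2+1)^2$ that appears in Problem 3.2.5(a).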

The notion of a differentiable map between differentiable manifolds also allows 
us to easily define what we mean by a curve on a manifold. 


Definition 3.2.5. Let $M$ be a differentiable manifold. A differentiable curve on $M$ is a differentiable function $\gamma : (a, b) \to M$, where the interval $(a, b)$ is understood as a one-dimensional manifold with the differential structure inherited from $\mathbb{R}$. A closed differentiable curve on $M$ is a differentiable function $\gamma : S^1 \to M$, where $S^1$ is the circle manifold.


PROBLEMS 


3.2.1. Consider the antipodal identification map described in Example 3.2.3. Explicitly write out all six functions $\phi_i \circ \pi \circ \pi_N^{-1}$ and $\phi_i \circ \pi \circ \pi_S^{-1}$. Prove that each one is differentiable on its natural domain.


3.2.2. In Example 3.2.4, with f(z) = z”, consider points on the unit sphere S? with 
coordinates $(x, y, z) \in \mathbb{R}^3$. Express $p_f$ on $S^2 - \{(0, 0, 1)\}$ in terms of $(x, y, z)$-coordinates by calculating $\pi_N^{-1} \circ f \circ \pi_N(x, y, z)$.
3.2.3. Consider the torus $T^2$ in $\mathbb{R}^3$ parametrized by

$$X(u, v) = ((2 + \cos v) \cos u, (2 + \cos v) \sin u, \sin v)$$

for $(u, v) \in [0, 2\pi]^2$. Consider the Gauss map of the torus $n : T^2 \to S^2$ that sends each point of the torus to its outward unit normal vector as an element of $S^2$. Using the stereographic projection of the sphere, explicitly show that this Gauss map is differentiable.

3.2.4. Consider the torus $T^2$ parametrized in the same way as in the previous exercise.

(a) Prove that the function $X$, restricted to $(0, 2\pi)^2$, gives a homeomorphism between an open subset of this torus and an open square in $\mathbb{R}^2$. Define $\phi_1 = X^{-1}$.

(b) Show that if we defined $\phi_2$ as the inverse of $(u, v) \mapsto X(u + \frac{\pi}{2}, v + \frac{\pi}{2})$ over $(0, 2\pi)^2$, and $\phi_3$ as the inverse of $(u, v) \mapsto X(u + \pi, v + \pi)$ over $(0, 2\pi)^2$, then $\{\phi_1, \phi_2, \phi_3\}$ is an atlas for the torus $T^2$. Show that no proper subset of this atlas is also an atlas of $T^2$.


(c) Define $f : T^2 \to S^2$ in reference to the $\phi_1$ chart as

$$f(u, v) = (\cos u \sin v, \sin u \sin v, \cos v).$$

Show that $f$ is well-defined and can be continued continuously over all of $T^2$.

(d) Use the stereographic projection atlas {mw,7s} of S* to calculate df, for 
(y1,y2) =m 0 fogy'. 


3.2.5. Consider the (unit) sphere given with the atlas defined by stereographic projection $A = \{\pi_N, \pi_S\}$ as in Example 3.1.4. Consider the function $f : S^2 \to \mathbb{R}$ given by $f(x, y, z) = z$ in terms of Cartesian coordinates.

(a) Show that for points in the sphere in the coordinate chart of $\pi_N$, a formula for the partial derivatives of $f$ is

$$\frac{\partial f}{\partial u} = \frac{4u}{(u^2 + v^2 + 1)^2} \quad \text{and} \quad \frac{\partial f}{\partial v} = \frac{4v}{(u^2 + v^2 + 1)^2}.$$

(b) Writing the coordinates on the $\pi_S$ chart as $(\bar{u}, \bar{v})$, find a formula for the partial derivatives $\partial f / \partial \bar{u}$ and $\partial f / \partial \bar{v}$ over the coordinate chart $\pi_S$.

(c) Explain in what sense these partial derivatives are equal over $S^2$, with the poles $\{(0, 0, 1), (0, 0, -1)\}$ removed. [Hint: Use the chain rule and the Jacobian matrix from Equation (3.22).]


3.2.6. Consider the function f : RP? + RP? defined by 


2 
ee) en a) fed al 
f(vo 221: 22:23) = | Vor3 — @1X2: XH — 10X11 X2 : x3 cos | ———3——5—_ : 


mtata+a 
Prove that f is a well-defined function. Prove also that f is a differentiable map. 


3.2.7. Consider the 3-sphere described by $S^3 = \{(z_1, z_2) \in \mathbb{C}^2 \mid |z_1|^2 + |z_2|^2 = 1\}$. Consider the function $h : S^3 \to S^2$ defined by $h(z_1, z_2) = (z_1 : z_2)$, where we identify $S^2$ with $\mathbb{CP}^1$ as in Example 3.2.4. (This function is called the Hopf map.)

(a) Suppose that $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$. Find an explicit formulation of $h(x_1, y_1, x_2, y_2)$.

(b) Prove that this function is a smooth map $S^3 \to S^2$. [Hint: Use atlases based on stereographic projection.]


3.2.8. Let $f : \mathbb{R}^{n+1} - \{0\} \to \mathbb{R}^{m+1} - \{0\}$ be a differentiable map. Let $d \in \mathbb{Z}$, and suppose that $f$ is such that $f(\lambda x) = \lambda^d f(x)$ for all $\lambda \in \mathbb{R} - \{0\}$ and all $x \in \mathbb{R}^{n+1} - \{0\}$. Such a map is said to be homogeneous of degree $d$. For any $x \in \mathbb{R}^{n+1} - \{0\}$, denote by $\bar{x}$ the corresponding equivalence class in $\mathbb{RP}^n$. Show that the map $F : \mathbb{RP}^n \to \mathbb{RP}^m$ defined by $F(\bar{x}) = \overline{f(x)}$ is well defined and differentiable.


3.2.9. Let $f : M^m \to N^n$ be a differentiable map between differentiable manifolds. Let $(U_1, x)$ and $(U_2, \bar{x})$ be overlapping coordinate charts on $M$ and let $(V_1, y)$ and $(V_2, \bar{y})$ be overlapping coordinate charts on $N$. Since we can write

$$\bar{y} \circ f \circ \bar{x}^{-1} = (\bar{y} \circ y^{-1}) \circ (y \circ f \circ x^{-1}) \circ (x \circ \bar{x}^{-1}),$$

show why in order for a function $f : M \to N$ to be of class $C^k$, both manifolds must be $C^k$-differentiable manifolds.


3.3 Tangent Spaces 


In the local theory of regular surfaces $S \subset \mathbb{R}^3$, the tangent plane plays a particularly important role. We define the first fundamental form on the tangent plane as the restriction of the dot product in $\mathbb{R}^3$ to the tangent plane. From the coefficients of the first fundamental form, one obtains all the concepts of intrinsic geometry, which include angles between curves, areas of regions, Gaussian curvature, geodesics, and even the Euler characteristic (see references to intrinsic geometry in [5]). The definition of a real differentiable manifold, however, makes no reference to an ambient Euclidean space, so we cannot imitate the theory of surfaces in $\mathbb{R}^3$ to define a tangent space to a manifold as a vector subspace of some $\mathbb{R}^n$.

From a physical perspective, we often think of a tangent vector to a surface $S \subset \mathbb{R}^3$ as the velocity vector at $p$ of some curve on $S$ through $p$. We understand this velocity vector to be an element in $\mathbb{R}^3$. Since we define manifolds without reference to an ambient Euclidean space, simply imagining the notion of a tangent vector poses serious conceptual challenges.

The reader can anticipate that to circumvent this difficulty, we must take a 
step in the direction of abstraction. We identify a tangent vector as a directional 
derivative at a point p of a real-valued function on a manifold M. Furthermore, 
since we cannot use vectors in an ambient Euclidean space to describe the notion 
of direction, we use curves on M through p to provide a notion of direction. The 
following construction makes this precise. 


Definition 3.3.1. Let $M^m$ be a differentiable manifold and let $p$ be a point on $M$. Let $\varepsilon > 0$, and let $\gamma : (-\varepsilon, \varepsilon) \to M$ be a differentiable curve on $M$ with $\gamma(0) = p$. For any real-valued differentiable function $f$ defined on some neighborhood of $p$, we define the directional derivative of $f$ along $\gamma$ at $p$ to be the number

$$D_\gamma(f) = \left.\frac{d}{dt}\big(f(\gamma(t))\big)\right|_{t=0}. \tag{3.6}$$

The operator $D_\gamma$ is called the tangent vector to $\gamma$ at $p$.


If $\gamma_1$ and $\gamma_2$ are two curves satisfying the conditions in the above definition, then $D_{\gamma_1} = D_{\gamma_2}$ if these operators have the same value at $p$ for all differentiable functions defined in open neighborhoods of $p$.

Note that $f \circ \gamma$ is a function $(-\varepsilon, \varepsilon) \to \mathbb{R}$, so the derivative in Equation (3.6) is taken in the usual sense. It is also interesting to observe that the above definition does not explicitly refer to any particular chart on $M$. However, in order to calculate $D_\gamma(f)$ it may be necessary to refer to a chart around $p$.

The above definition of a tangent vector may initially come as a source of mental 
discomfort since it presents tangent vectors as operators instead of as the geometric 
objects with which we are used to working. However, any tangent vector (defined 
in the classical sense) to a regular surface $S$ in $\mathbb{R}^3$ naturally defines a directional derivative of a function $S \to \mathbb{R}$, so Definition 3.3.1 generalizes the usual notion of a tangent vector (see [5, Section 5.2]).

As the name “tangent vector” suggests, the set of all tangent vectors forms a 
vector space, a fact that we show now. 

Let $U$ be an open neighborhood of $p$ in $M$. Call $C^1(U, \mathbb{R})$ the set (vector space) of all differentiable functions from $U$ to $\mathbb{R}$. A priori, the set of tangent vectors $D_\gamma$ at $p$ on $M$ is a subset of all operators $W = \{C^1(U, \mathbb{R}) \to \mathbb{R}\}$. By the properties of differentiation,

$$D_\gamma(f + g) = D_\gamma(f) + D_\gamma(g) \quad \text{and} \quad D_\gamma(cf) = cD_\gamma(f),$$

so $D_\gamma$ is a linear transformation from $C^1(U, \mathbb{R})$ to $\mathbb{R}$. For readers who are familiar with the dual of a vector space, this latter result shows that $D_\gamma$ is in the dual vector space $C^1(U, \mathbb{R})^*$. (We discuss the dual of a vector space in Section 4.1.) We would like to show that the set of tangent vectors is a subspace of $C^1(U, \mathbb{R})^*$, i.e., closed under addition and scalar multiplication.

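Let $\gamma : (-\varepsilon, \varepsilon) \to M$ be a differentiable curve with $\gamma(0) = p$. If we define $\gamma_1(t) = \gamma(at)$, where $a$ is some real number, then using the usual chain rule for any differentiable function $f \in C^1(U, \mathbb{R})$, we have

$$D_{\gamma_1}(f) = \left.\frac{d}{dt}\big(f \circ \gamma(at)\big)\right|_{t=0} = a \left.\frac{d}{dt}\big(f(\gamma(t))\big)\right|_{t=0} = aD_\gamma(f).$$

This shows that the set of tangent vectors is closed under scalar multiplication.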
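The rescaling identity $D_{\gamma_1}(f) = aD_\gamma(f)$ can be illustrated numerically. The sketch below is a hypothetical example, not taken from the text: it picks one curve on the unit sphere and one test function, and approximates Equation (3.6) with a central difference.

```python
import math


def gamma(t):
    """A sample curve on the unit sphere (an illustrative choice)."""
    return (
        math.sin(1 + t) * math.cos(2 * t),
        math.sin(1 + t) * math.sin(2 * t),
        math.cos(1 + t),
    )


def f(p):
    """A sample real-valued function near gamma(0)."""
    x, y, z = p
    return x * y + z


def D(curve, func, h=1e-6):
    """Central-difference estimate of d/dt func(curve(t)) at t = 0."""
    return (func(curve(h)) - func(curve(-h))) / (2 * h)


def rescale(curve, a):
    """The reparametrized curve t -> curve(a * t)."""
    return lambda t: curve(a * t)


if __name__ == "__main__":
    a = 3.0
    print(D(rescale(gamma, a), f), a * D(gamma, f))
```

For this particular choice, differentiating $f(\gamma(t)) = \tfrac{1}{2}\sin^2(1+t)\sin(4t) + \cos(1+t)$ by hand gives $D_\gamma(f) = 2\sin^2(1) - \sin(1)$, which the estimate reproduces.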

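In order to prove that the set of tangent vectors is closed under addition, we make reference to a coordinate chart $x : U \to \mathbb{R}^m$, where $U$ is an open neighborhood of $p$. Without loss of generality, we assume that $x(p) = (0, 0, \ldots, 0)$. We rewrite the composition $f \circ \gamma = f \circ x^{-1} \circ x \circ \gamma$, where $x \circ \gamma : (-\varepsilon, \varepsilon) \to \mathbb{R}^m$ and $f \circ x^{-1} : x(U) \subset \mathbb{R}^m \to \mathbb{R}$. By the chain rule in multivariable analysis, Theorem (1.3.3), we have

$$\begin{aligned}
D_\gamma(f) &= \left.\frac{d}{dt}\big(f(\gamma(t))\big)\right|_{t=0} \\
&= \left.\frac{d}{dt}\big(f \circ x^{-1}(x \circ \gamma(t))\big)\right|_{t=0} \\
&= d(f \circ x^{-1})_{0} \, d(x \circ \gamma)\big|_{t=0},
\end{aligned}$$

where we evaluate $d(f \circ x^{-1})$ at $0 = (0, 0, \ldots, 0)$ because $x(p) = 0$.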


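Let $\alpha$ and $\beta$ be two differentiable curves on $M$ such that $\alpha(0) = \beta(0) = p$. Over the intersection of the domains of $\alpha$ and $\beta$ (and for $t$ close enough to $0$ that the sum below lies in $x(U)$), define the curve $\gamma$ by

$$\gamma(t) = x^{-1}\big(x \circ \alpha(t) + x \circ \beta(t)\big).$$

Note that $\gamma(0) = x^{-1}(x(\alpha(0)) + x(\beta(0))) = x^{-1}(0 + 0) = x^{-1}(0) = p$. Furthermore, for any function $f : U \to \mathbb{R}$, we have

$$\begin{aligned}
D_\alpha(f) + D_\beta(f) &= d(f \circ x^{-1})_{0} \, d(x \circ \alpha)\big|_{t=0} + d(f \circ x^{-1})_{0} \, d(x \circ \beta)\big|_{t=0} \\
&= d(f \circ x^{-1})_{0} \big(d(x \circ \alpha)\big|_{t=0} + d(x \circ \beta)\big|_{t=0}\big) \\
&= d(f \circ x^{-1})_{0} \, d(x \circ \alpha + x \circ \beta)\big|_{t=0} \\
&= D_\gamma(f).
\end{aligned}$$

Thus, the set of tangent vectors is closed under addition. This puts us in a position to prove the following foundational fact.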


Proposition 3.3.2. Let $M$ be a differentiable manifold of dimension $m$, and let $p$ be a point of $M$. The set of all tangent vectors to $M$ at $p$ is a vector space of dimension $m$ with basis $\{\partial/\partial x^i \mid i = 1, \ldots, m\}$, where $(x^1, x^2, \ldots, x^m)$ are the coordinates on some chart around $p$.


Definition 3.3.3. The vector space of tangent vectors is called the tangent space of $M$ at $p$ and is denoted by $T_pM$.


Proof of Proposition 3.3.2. The prior discussion has shown that the set $T_pM$ is a vector space. It remains to be shown that it has dimension $m$.

Let $x : U \to \mathbb{R}^m$ be a system of local coordinates at $p$ with, as above, $x(p) = 0$. Write $x(q) = (x^1(q), \ldots, x^m(q))$, and define the coordinate line curve $v_i : (-\varepsilon, \varepsilon) \to M$ by $v_i(t) = x^{-1}(0, \ldots, 0, t, 0, \ldots, 0)$, where the $t$ occurs in the $i$th place. Then

$$D_{v_i}(f) = \left.\frac{d}{dt} f \circ x^{-1}(0, \ldots, 0, t, 0, \ldots, 0)\right|_{t=0} = \left.\frac{\partial f}{\partial x^i}\right|_p$$

according to the notation given in Equation (3.5). We can therefore write, as operators, $D_{v_i} = \left.\dfrac{\partial}{\partial x^i}\right|_p$.

For any differentiable curve $\gamma$ on $M$ with $\gamma(0) = p$, we can then write in coordinates $x \circ \gamma(t) = (\gamma^1(t), \ldots, \gamma^m(t))$, where $\gamma^i(t) = x^i(\gamma(t))$. Then

$$D_\gamma(f) = \left.\frac{d}{dt} f(\gamma(t))\right|_{t=0} = \sum_{i=1}^{m} \left.\frac{\partial f}{\partial x^i}\right|_p \left.\frac{d\gamma^i}{dt}\right|_{t=0}.$$

This presents the operator $D_\gamma$ as a linear combination of the operators $\partial/\partial x^i|_p$. It is also a trivial matter to show that for $1 \le i \le m$, the operators $\partial/\partial x^i|_p$ are linearly independent (apply a vanishing linear combination to each coordinate function $x^j$). Consequently, they form a basis of $T_pM$, which proves that $\dim T_pM = m$.
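The expansion of $D_\gamma$ in the basis $\{\partial/\partial x^i|_p\}$ can be illustrated on the sphere. The following sketch is an assumption-laden example: it uses the north-pole stereographic chart (assumed form of Example 3.1.4), a hypothetical curve specified in chart coordinates, and a hypothetical test function, and compares both sides of the formula using central differences.

```python
def pi_N_inv(u, v):
    """Inverse stereographic projection from the north pole (assumed form)."""
    d = u * u + v * v + 1.0
    return (2 * u / d, 2 * v / d, (u * u + v * v - 1.0) / d)


def f(p):
    """A sample real-valued function on the sphere."""
    x, y, z = p
    return x * y + z


def chart_curve(t):
    """The curve in chart coordinates, x o gamma; its velocity at 0 is (1, 2)."""
    return (0.4 + t, -0.9 + 2.0 * t - t * t)


def gamma(t):
    """The corresponding curve on the sphere."""
    return pi_N_inv(*chart_curve(t))


def ddt(fn, h=1e-6):
    """Central-difference derivative of fn at t = 0."""
    return (fn(h) - fn(-h)) / (2 * h)


def partial(i, coords, h=1e-6):
    """Estimate of the partial derivative of f in the sense of Equation (3.5)."""
    plus = list(coords)
    minus = list(coords)
    plus[i] += h
    minus[i] -= h
    return (f(pi_N_inv(*plus)) - f(pi_N_inv(*minus))) / (2 * h)


def both_sides():
    """Return (D_gamma(f), sum_i df/dx^i * dgamma^i/dt) at t = 0."""
    lhs = ddt(lambda t: f(gamma(t)))
    velocity = [ddt(lambda t, i=i: chart_curve(t)[i]) for i in range(2)]
    rhs = sum(partial(i, chart_curve(0.0)) * velocity[i] for i in range(2))
    return lhs, rhs


if __name__ == "__main__":
    print(both_sides())
```

The two numbers agree up to discretization error, and the chart velocity $(1, 2)$ gives the components of $D_\gamma$ in the coordinate basis.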


Because the operators $\partial/\partial x^i$ occur so often in the theory of manifolds, one often uses an abbreviated notation. Whenever the coordinate system is understood by context, where one uses $x = (x^1, \ldots, x^m)$ or another letter, we write

$$\partial_i \overset{\text{def}}{=} \frac{\partial}{\partial x^i}, \tag{3.7}$$

whose explicit meaning is given by Equation (3.5). This notation shortens the standard partial derivative notation and makes it easier to write it in inline formulas.

From our definition of tangent vectors, if the manifold is of class $C^2$ we can give an alternate characterization of the tangent space $T_pM$.


Definition 3.3.4. A function $X : C^1(M, \mathbb{R}) \to \mathbb{R}$ is called a derivation of $C^1(M, \mathbb{R})$ at $p$ if it satisfies
1. Linearity: $X(af + bg) = aX(f) + bX(g)$ for all $f, g \in C^1(M, \mathbb{R})$ and $a, b \in \mathbb{R}$;
2. Leibniz's rule: $X(fg) = X(f)g(p) + f(p)X(g)$ for all $f, g \in C^1(M, \mathbb{R})$.


Note that $C^k(M, \mathbb{R})$ is an algebra, that is, a vector space equipped with a "multiplication" operation that is bilinear over the vector space. So, if $k \ge 1$, a derivation on $C^k(M, \mathbb{R})$ at $p$ is a linear transformation from the algebra $C^k(M, \mathbb{R})$ to $\mathbb{R}$, satisfying additionally what is tantamount to a product rule.


Proposition 3.3.5. Let $X$ be a derivation of $C^1(M, \mathbb{R})$ at $p$ and $f$ a constant function on $M$. Then $X(f) = 0$.


Proof. (Left as an exercise for the reader.) 


Theorem 3.3.6. Let $M^m$ be a $C^2$-differentiable manifold. The tangent space $T_pM$ is the set of derivations of $C^2(M, \mathbb{R})$ at $p$.


Proof. We have already seen that every tangent vector is a derivation, so $T_pM$ is a vector subspace of the set of derivations of $C^2(M, \mathbb{R})$ at $p$.

Conversely, let $X$ be a derivation of $C^1(M, \mathbb{R})$ at $p$. Let $U$ be a coordinate neighborhood of $p$ with coordinates $x = (x^1, x^2, \ldots, x^m)$ and suppose that the coordinates of $p$ are $x(p) = a = (a^1, a^2, \ldots, a^m)$. Without loss of generality, suppose that $x(U)$ is an open ball in $\mathbb{R}^m$ centered at $x(p)$. For $i = 1, \ldots, m$, let $X(x^i) = v^i$.

By Theorem 1.3.8, for any function $f \in C^2(M, \mathbb{R})$, setting

$$c_i = \frac{\partial f}{\partial x^i}(p) = \left.\frac{\partial (f \circ x^{-1})}{\partial x^i}\right|_{x(p)},$$

the first-order Taylor series of $f$ at $p$ is

$$f \circ x^{-1}(x^1, x^2, \ldots, x^m) = (f \circ x^{-1})(a) + \sum_{i=1}^{m} c_i (x^i - a^i) + \sum_{i=1}^{m} (g_i \circ x^{-1})(x^1, x^2, \ldots, x^m)(x^i - a^i),$$

where $g_i \in C^1(U, \mathbb{R})$ with $(g_i \circ x^{-1})(a) = 0$. Since the $g_i$ are of class $C^1$, we can apply a derivation to them. Then by linearity and the Leibniz rule,

$$\begin{aligned}
X(f) = X(f(p)) &+ \sum_{i=1}^{m} \Big( X(c_i)(x^i - a^i)\big|_p + c_i\big(X(x^i) - X(a^i)\big) \Big) \\
&+ \sum_{i=1}^{m} \Big( X(g_i)(x^i - a^i)\big|_p + g_i(p)\big(X(x^i) - X(a^i)\big) \Big).
\end{aligned}$$

Then by Proposition 3.3.5 and the assumption that $X(x^i) = v^i$,

$$X(f) = \sum_{i=1}^{m} c_i v^i + \sum_{i=1}^{m} \big(X(g_i) \cdot 0 + 0 \cdot v^i\big) = \sum_{i=1}^{m} c_i v^i.$$

Thus $X = v^1 \partial_1 + v^2 \partial_2 + \cdots + v^m \partial_m$. Since $\partial_i \in T_p(M)$, we deduce that the set of derivations of $C^2(M, \mathbb{R})$ at $p$ is also a subspace of $T_pM$. The result follows.


Example 3.3.7 (Tangent Space of $\mathbb{R}^n$). We consider the tangent space for the manifold $\mathbb{R}^n$ itself. We assume the standard differential structure.

Let $p$ be a point in $\mathbb{R}^n$, and let $v = (v_1, \ldots, v_n)$ be a vector. Consider the line traced out by the curve $\gamma(t) = p + tv$. We wish to find the coordinates of the tangent vector $D_\gamma$ with respect to the standard basis of $T_p(\mathbb{R}^n)$, namely, $\{\partial/\partial x^i\}$ or, according to the notation of Equation (3.7), $\{\partial_i\}$. For any real function $f$ defined over a neighborhood of $p$, we have

$$D_\gamma(f) = \left.\frac{d}{dt} f(p_1 + tv_1, \ldots, p_n + tv_n)\right|_{t=0} = \sum_{i=1}^{n} \left.\frac{\partial f}{\partial x^i}\right|_p v_i.$$

So with respect to the basis $\{\partial/\partial x^i\}$, the coordinates of $D_\gamma$ are $(v_1, \ldots, v_n)$. Therefore, at each $p \in \mathbb{R}^n$, the map $v \mapsto D_\gamma$ sets up an isomorphism between the vector spaces $\mathbb{R}^n$ and $T_p(\mathbb{R}^n)$ by identifying $\partial_i$ with the $i$th standard basis vector. It is common to abuse the notation and view the tangent space $T_p(\mathbb{R}^n)$ as equal to $\mathbb{R}^n$.

Note that if $v$ is a unit vector, then $D_\gamma$ is equal to the directional derivative operator in the direction of $v$.

More generally, for any differentiable curve $\gamma : (-\varepsilon, \varepsilon) \to \mathbb{R}^n$ with $\gamma(0) = p$, we have

$$D_\gamma(f) = \sum_{i=1}^{n} \left.\frac{\partial f}{\partial x^i}\right|_p \frac{d\gamma^i}{dt}(0).$$

Hence, as an operator, we can write

$$D_\gamma = \sum_{i=1}^{n} \frac{d\gamma^i}{dt}(0) \left.\frac{\partial}{\partial x^i}\right|_p,$$

which illustrates $D_\gamma$ as a vector with the same components as the usual velocity vector $\gamma'(0)$, given with respect to the basis $\{\partial/\partial x^i\}$.
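As a small worked illustration of this example (the function, point, and vector are hypothetical choices, not taken from the text), take $f(x, y) = x^2 y$ on $\mathbb{R}^2$, $p = (1, 2)$, and $v = (3, -1)$, so that $\gamma(t) = (1 + 3t, 2 - t)$. Both the definition and the coordinate formula give the same number:

```latex
\begin{aligned}
D_\gamma(f)
  &= \left.\frac{d}{dt}\,(1+3t)^2(2-t)\right|_{t=0}
   = \Big[\,6(1+3t)(2-t) - (1+3t)^2\,\Big]_{t=0} = 12 - 1 = 11, \\
\sum_{i=1}^{2} \left.\frac{\partial f}{\partial x^i}\right|_p v_i
  &= (2xy)\big|_{(1,2)} \cdot 3 + (x^2)\big|_{(1,2)} \cdot (-1)
   = 4 \cdot 3 - 1 = 11 .
\end{aligned}
```

So, in the basis $\{\partial_1, \partial_2\}$ of $T_p(\mathbb{R}^2)$, the tangent vector $D_\gamma$ has components $(3, -1)$, the same as $v$.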


Example 3.3.8 (Regular Surfaces). Let $S$ be a regular surface in $\mathbb{R}^3$. In Chapter 5 of [5], the authors define the tangent plane to $S$ at $p$ as the subspace of $\mathbb{R}^3$ consisting of all vectors $\gamma'(0)$, where $\gamma(t)$ is a curve on $S$ with $\gamma(0) = p$. The correspondence $\gamma'(0) \leftrightarrow D_\gamma$ identifies the tangent space for regular surfaces with the tangent space of manifolds as defined above. This shows that the present definition directly generalizes the previous definition as a subspace of the ambient space $\mathbb{R}^3$.

In multivariable calculus, one shows that given a parametrization $X : V \subset \mathbb{R}^2 \to \mathbb{R}^3$ of a coordinate patch of a regular surface, if $p = X(u_0, v_0)$, then a basis for $T_pS$ is

$$\{X_u(u_0, v_0), X_v(u_0, v_0)\}.$$

The definition of the tangent plane given in calculus meshes with Definition 3.3.1 and Proposition 3.3.2 in the following way. A tangent vector in the classical sense, $w \in T_pS$, is a vector such that $w = \gamma'(t_0)$, where $\gamma(t)$ is a curve on $S$ with $\gamma(t_0) = p$. Write $\gamma(t) = X(\alpha(t))$, with $\alpha(t_0) = (u_0, v_0)$. Writing $\alpha(t) = (u(t), v(t))$, we have

$$w = u'(t_0) X_u(u_0, v_0) + v'(t_0) X_v(u_0, v_0). \tag{3.8}$$

Now the corresponding coordinate chart $x$ on $S$ in the language of manifolds is the inverse of the parametrization, $x = X^{-1}$, defined over $U = X(V) \subset S$. The tangent vector (in the phrasing of Definition 3.3.1) associated to $\gamma$ at $p$ is

$$\begin{aligned}
D_\gamma(f) &= \left.\frac{d}{dt} f(\gamma(t))\right|_{t_0} = \left.\frac{d}{dt} f(X(\alpha(t)))\right|_{t_0} = \left.\frac{d}{dt} f \circ x^{-1}(\alpha(t))\right|_{t_0} \\
&= u'(t_0) \left.\frac{\partial f}{\partial u}\right|_p + v'(t_0) \left.\frac{\partial f}{\partial v}\right|_p,
\end{aligned} \tag{3.9}$$

where the partial derivatives $\left.\frac{\partial f}{\partial u}\right|_p$ and $\left.\frac{\partial f}{\partial v}\right|_p$ are in the sense of Equation (3.5). We can write as operators

$$D_\gamma = u'(t_0) \left.\frac{\partial}{\partial u}\right|_p + v'(t_0) \left.\frac{\partial}{\partial v}\right|_p. \tag{3.10}$$

Therefore, we see that the correspondence between the definition of the tangent space for manifolds and the definition for tangent spaces to regular surfaces in $\mathbb{R}^3$ identifies $X_u(u_0, v_0)$ with $\left.\frac{\partial}{\partial u}\right|_p$, and similarly for the $v$-coordinate.
U'p 


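Obviously, the bases for $T_pM$ described in Proposition 3.3.2 are dependent on the coordinate charts. The following proposition shows how to change coordinates.


Proposition 3.3.9. Let $M^n$ be a differentiable manifold; let $(U_1, \phi_1)$ and $(U_2, \phi_2)$ be overlapping coordinate charts; and let $p \in U_1 \cap U_2$. Denote by $(x^j)$ the coordinates of $\phi_1$ and by $(\bar{x}^i)$ the coordinates of $\phi_2$. Let $\mathcal{B} = \{\partial_1, \partial_2, \ldots, \partial_n\}$ and $\bar{\mathcal{B}} = \{\bar{\partial}_1, \bar{\partial}_2, \ldots, \bar{\partial}_n\}$ be the two bases for $T_pM$ defined by Proposition 3.3.2 with respect to the coordinate systems (where by $\bar{\partial}_i$ we mean $\partial/\partial \bar{x}^i$). The coordinate change matrix from $\mathcal{B}$ to $\bar{\mathcal{B}}$ coordinates on $T_pM$ is $d(\phi_2 \circ \phi_1^{-1})$, the differential of the transition function. In other words, for all $X \in T_pM$, if

$$[X]_{\mathcal{B}} = \begin{pmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{pmatrix} \quad \text{and} \quad [X]_{\bar{\mathcal{B}}} = \begin{pmatrix} \bar{v}^1 \\ \bar{v}^2 \\ \vdots \\ \bar{v}^n \end{pmatrix}, \quad \text{then} \quad \bar{v}^i = \sum_{j=1}^{n} \left.\frac{\partial \bar{x}^i}{\partial x^j}\right|_p v^j.$$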


Proof. (Left as an exercise for the reader.) 
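The change-of-coordinates rule can be illustrated on $\mathbb{R}^2$ with Cartesian coordinates $(x, y)$ as the first chart and polar coordinates $(r, \theta)$ as the second. This is a sketch under assumptions: the charts, the point, and the vector are illustrative choices, and the function names are hypothetical. It compares the components obtained from the Jacobian of the transition function with those obtained directly from a curve.

```python
import math


def polar(x, y):
    """The second chart: polar coordinates (r, theta), away from the origin."""
    return (math.hypot(x, y), math.atan2(y, x))


def transition_jacobian(x, y):
    """Jacobian of (x, y) -> (r, theta), i.e. the matrix (d xbar^i / d x^j)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r], [-y / r2, x / r2]]


def change_components(J, v):
    """Apply the coordinate change matrix to the component vector v."""
    return [sum(J[i][j] * v[j] for j in range(2)) for i in range(2)]


def components_from_curve(p, v, h=1e-6):
    """Differentiate the polar coordinates along the line t -> p + t v at t = 0."""
    plus = polar(p[0] + h * v[0], p[1] + h * v[1])
    minus = polar(p[0] - h * v[0], p[1] - h * v[1])
    return [(a - b) / (2 * h) for a, b in zip(plus, minus)]


if __name__ == "__main__":
    p, v = (1.0, 2.0), (3.0, -1.0)
    print(change_components(transition_jacobian(*p), v))
    print(components_from_curve(p, v))
```

For $p = (1, 2)$ and $v = (3, -1)$, both computations give the polar components $(1/\sqrt{5}, -7/5)$.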


PROBLEMS 


3.3.1. Let M be a differentiable manifold. Let p € M and let X € T,M. Prove that if f 
is a constant function, then X(f) = 0. 


3.3.2. Prove Proposition 3.3.9. 


3.3.3. Consider $\mathbb{RP}^2$ with the usual atlas $\{\phi_0, \phi_1, \phi_2\}$. Let $(u_1, u_2)$ be coordinates corresponding to $\phi_0$ and $(v_1, v_2)$ coordinates corresponding to $\phi_1$. Let $p \in U_0 \cap U_1$. Calculate the change of coordinate matrix on $T_p(\mathbb{RP}^2)$ from $u$-coordinates to $v$-coordinates.


3.3.4. Let $M$ be a differentiable manifold. A class $C^k$ (resp. smooth) function element on $M$ is a pair $(f, U)$ where $U$ is an open subset of $M$ and $f : U \to \mathbb{R}$ is of class $C^k$ (resp. smooth). Recall that we cannot discuss functions of class $C^k$ unless $M$ is a $C^k$-differentiable manifold. Given a point $p \in M$, define the relation $\equiv$ on the set of function elements with $(f, U) \equiv (g, V)$ whenever $p \in U \cap V$ and there is a neighborhood $W$ of $p$ in $U \cap V$ such that $f|_W = g|_W$, i.e., the restrictions of $f$ and $g$ to $W$ are equal as functions.

(a) Fix a $k$ and a point $p \in M$. Prove that $\equiv$ is an equivalence relation. [The equivalence class $[(f, U)]$ of some element $(f, U)$, where $p \in U$, is called a germ at $p$. The set of all germs at $p$ of class $C^k$ functions is denoted by $C_p^k(M, \mathbb{R})$.]

(b) Prove that the following addition and scalar multiplication

$$[(f, U)] + [(g, V)] \overset{\text{def}}{=} [(f + g, U \cap V)] \quad \text{and} \quad c[(f, U)] \overset{\text{def}}{=} [(cf, U)]$$

are well-defined and make $C_p^k(M, \mathbb{R})$ into a vector space.

(c) Prove that the multiplication on $C_p^k(M, \mathbb{R})$

$$[(f, U)][(g, V)] \overset{\text{def}}{=} [(fg, U \cap V)]$$

is associative, has an identity, and distributes over the addition. [This makes $C_p^k(M, \mathbb{R})$ into an associative algebra.]

(d) Let $\gamma : (-\varepsilon, \varepsilon) \to M$ be a curve on $M$ with $\gamma(0) = p$. Prove that $D_\gamma([(f, U)]) \overset{\text{def}}{=} D_\gamma f$ is well-defined, i.e., that if $(f, U) \equiv (g, V)$, then $D_\gamma f = D_\gamma g$.


3.4 The Differential of a Differentiable Map 


Having established the notion of a tangent space to a differentiable manifold at a point, we are in a position to define the differential of a differentiable map $f : M \to N$. Recall that in multivariable real analysis, we call a function $F : \mathbb{R}^m \to \mathbb{R}^n$ differentiable at a point $p \in \mathbb{R}^m$ if there exists an $n \times m$ matrix $A$ such that

$$F(p + h) = F(p) + Ah + R(h),$$

where $R(h)$ is a continuous function defined around $0$ such that $\|R(h)\| / \|h\| \to 0$ as $\|h\| \to 0$. We refer to the matrix $A$ as the differential $dF_p$. Surprisingly, given our definition of the tangent space to a manifold, there exists a more natural way to define the differential.


Definition 3.4.1. Let $F : M^m \to N^n$ be a differentiable map between differentiable manifolds. We define the differential of $F$ at $p \in M$ as the linear transformation between vector spaces

$$\begin{aligned}
dF_p : T_pM &\longrightarrow T_{F(p)}N, \\
D_\gamma &\longmapsto D_{F \circ \gamma}.
\end{aligned}$$

The differential $dF_p$ is also denoted by $F_*$, with $p$ understood from context. If $X \in T_pM$, then $F_*(X)$ is also called the push-forward of $X$ by $F$.


From this definition, it is not immediately obvious that dF, is linear, but, as the 
following proposition shows, we can give an equivalent definition of the differential 
that makes it easy to show that the differential is linear. Figure 3.8 depicts the 
differential of a map between manifolds. 
Figure 3.8: The differential of a map between manifolds.


Proposition 3.4.2. Let $F : M \to N$ be a differentiable map between manifolds. Then at each $p \in M$, the function $F_* = dF_p$ satisfies

$$F_*(X)(g) = X(g \circ F)$$

for every vector $X \in T_pM$ and every function $g$ from $N$ into $\mathbb{R}$ defined in a neighborhood of $F(p)$. Furthermore, $F_*$ is linear.


Proof. Let $X \in T_pM$, with $X = D_\gamma$ for some curve $\gamma$ on $M$ with $\gamma(0) = p$. For all real-valued functions $g$ defined in a neighborhood of $F(p)$ on $N$,

$$\begin{aligned}
F_*(X)(g) &= dF_p(D_\gamma)(g) = D_{F \circ \gamma}(g) \\
&= \left.\frac{d}{dt}(g \circ F \circ \gamma)(t)\right|_{t=0} = D_\gamma(g \circ F) = X(g \circ F).
\end{aligned}$$

To show linearity, let $X, Y \in T_pM$ and $a, b \in \mathbb{R}$. Then

$$F_*(aX + bY)(g) = aX(g \circ F) + bY(g \circ F) = aF_*(X)(g) + bF_*(Y)(g),$$

which shows that $F_*$ is linear.

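Note that this definition is independent of any coordinate system near $p$ or $F(p)$. However, given specific coordinate charts $x : U \to \mathbb{R}^m$ and $y : V \to \mathbb{R}^n$ whose domains are, respectively, neighborhoods of $p$ in $M$ and $F(p)$ in $N$, with $F(U) \subset V$, we can define a matrix that represents $dF_p$. Set $v_i$ as the coordinate line for the variable $x^i$ in the chart $x$. In the usual basis of $T_pM$, we have

$$F_*\left(\left.\frac{\partial}{\partial x^i}\right|_p\right) = F_*(D_{v_i}) = D_{F \circ v_i}.$$

However, for any smooth function $g : N \to \mathbb{R}$ and any curve $\gamma$ on $M$ with $\gamma(0) = p$,

$$D_{F \circ \gamma}(g) = \sum_{j=1}^{n} \left.\frac{\partial g}{\partial y^j}\right|_{F(p)} \left.\frac{d(y^j \circ F \circ \gamma)}{dt}\right|_{t=0} \tag{3.11}$$

$$= \sum_{j=1}^{n} \left.\frac{\partial g}{\partial y^j}\right|_{F(p)} \left(\sum_{i=1}^{m} \left.\frac{\partial (y^j \circ F)}{\partial x^i}\right|_p \left.\frac{d\gamma^i}{dt}\right|_{t=0}\right), \tag{3.12}$$

where $\gamma^i = x^i \circ \gamma$. Therefore, in terms of these coordinate patches, the matrix for $F_*$ with respect to the standard bases $\{\partial/\partial x^i\}$ on $T_pM$ and $\{\partial/\partial y^j\}$ on $T_{F(p)}N$ is

$$[dF_p] = \left(\left.\frac{\partial F^j}{\partial x^i}\right|_p\right), \quad \text{with } 1 \le i \le m \text{ and } 1 \le j \le n, \tag{3.13}$$

by which we explicitly mean

$$\left.\frac{\partial F^j}{\partial x^i}\right|_p \overset{\text{def}}{=} \left.\frac{\partial (y^j \circ F \circ x^{-1})}{\partial x^i}\right|_{x(p)}. \tag{3.14}$$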
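Equations (3.13) and (3.14) say that the matrix of $dF_p$ is the ordinary Jacobian of the coordinate expression $y \circ F \circ x^{-1}$. The sketch below estimates it by central differences for the map $p_f$ of Example 3.2.4 with $f(z) = z^2$, using the north-pole stereographic chart on both sides. The chart formulas are assumed standard forms, and all function names are hypothetical.

```python
def pi_N(p):
    """Stereographic projection from the north pole (assumed form)."""
    x, y, z = p
    return (x / (1.0 - z), y / (1.0 - z))


def pi_N_inv(u, v):
    d = u * u + v * v + 1.0
    return (2 * u / d, 2 * v / d, (u * u + v * v - 1.0) / d)


def p_f(q):
    """The map on the sphere induced by f(z) = z^2, away from the north pole."""
    w = complex(*pi_N(q))
    w2 = w * w
    return pi_N_inv(w2.real, w2.imag)


def local_rep(pt):
    """Coordinate expression y o F o x^{-1} with x = y = pi_N."""
    return pi_N(p_f(pi_N_inv(*pt)))


def jacobian(fn, pt, h=1e-6):
    """Central-difference Jacobian; entry [j][i] is d fn^j / d x^i."""
    n_out = len(fn(pt))
    J = [[0.0] * len(pt) for _ in range(n_out)]
    for i in range(len(pt)):
        plus, minus = list(pt), list(pt)
        plus[i] += h
        minus[i] -= h
        fp, fm = fn(plus), fn(minus)
        for j in range(n_out):
            J[j][i] = (fp[j] - fm[j]) / (2 * h)
    return J


if __name__ == "__main__":
    print(jacobian(local_rep, (0.3, -0.5)))
```

Since the coordinate expression is $(u^2 - v^2, 2uv)$, the estimate should be close to the matrix with rows $(2u, -2v)$ and $(2v, 2u)$ at the chosen point.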


Example 3.4.3 (Curves on a Manifold). We used the notion of a curve on a manifold to define tangent vectors in the first place. However, we can now restate the notion of a curve on a manifold as a differentiable map $\gamma : I \to M$, where $I$ is an open interval of $\mathbb{R}$ and $M$ is a differentiable manifold. The tangent vector $D_\gamma \in T_{\gamma(t_0)}M$ to the curve $\gamma$ can be understood as

$$D_\gamma = \gamma_*\left(\left.\frac{d}{dt}\right|_{t_0}\right). \tag{3.15}$$

Matching with notation from calculus courses, this tangent vector is sometimes denoted as $\gamma'(t_0)$. Then this tangent vector acts on differentiable functions $f : M \to \mathbb{R}$ by

$$\gamma'(t_0)(f) = \gamma_*\left(\left.\frac{d}{dt}\right|_{t_0}\right)(f) = \left.\frac{d}{dt}(f \circ \gamma)\right|_{t_0}. \tag{3.16}$$


Example 3.4.4 (Gauss Map). Consider a regular oriented surface $S$ in $\mathbb{R}^3$ with orientation $n : S \to S^2$. (Recall from calculus that the orientation is a choice of a unit normal vector to $S$ at each point such that $n : S \to S^2$ is a continuous function.) In the local theory of surfaces, the function $n$ is often called the Gauss map. The differential of the Gauss map plays a central role in the differential geometry of surfaces. In that context, we define the differential of the Gauss map $dn_p$ at a point $p \in S$ in the following way.

A parametrization $X(u, v)$ of a coordinate patch $U$ around $p$ amounts to the inverse $X = x^{-1}$ of a chart $x : U \to \mathbb{R}^2$. Similarly, on $S^2$, the parametrization $N = n \circ X$ is the inverse of a chart $y$ on $S^2$ of a neighborhood of $n(p)$. Since $N$ is a unit vector in $\mathbb{R}^3$, by the comments in Section 2.2, we know that $N_u$ and $N_v$ are perpendicular to $N$ and hence are in the tangent space $T_pS$. Hence, we often identify $T_pS = T_{n(p)}(S^2)$. Let $\gamma(t) = X(\alpha(t))$ be any curve on the surface such that $\gamma(0) = p$. Then $dn_p$ is the transformation on $T_pS$ that sends a tangent vector $\gamma'(0) \in T_p(S)$ to $\left.\frac{d}{dt}\big(N(\alpha(t))\big)\right|_{t=0}$.

Via the association $\gamma'(0) \mapsto D_\gamma$ between the classical and the modern definition of the tangent space, we see that the classical definition of the differential of the Gauss map is precisely Definition 3.4.1. (Note that Figure 3.8 specifically illustrates the differential of the Gauss map.)

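Over some neighborhood of $n(p)$, the function $N : x(U) \to S^2$ gives a parametrization of a coordinate neighborhood of $n(p)$ on $S^2$. Write the coordinate functions as $x(q) = (x^1(q), x^2(q))$ and similarly for $y$. Then the associated bases on $T_pS$ and $T_{n(p)}(S^2)$ are

$$\left\{\frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}\right\} \text{ identified as } \{X_1, X_2\} \quad \text{and} \quad \left\{\frac{\partial}{\partial y^1}, \frac{\partial}{\partial y^2}\right\} \text{ identified as } \{N_1, N_2\}.$$

Thus, with respect to the coordinate charts $x$ and $y$ as described here, the matrix for $dn_p$ is

$$[dn_p] = (a^i_j), \quad \text{where } N_j = a^1_j X_1 + a^2_j X_2,$$

where by $X_i$ we mean $\partial X / \partial x^i$. It is not hard to show that

$$\begin{pmatrix} a^1_1 & a^1_2 \\ a^2_1 & a^2_2 \end{pmatrix} = -\begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix}^{-1} \begin{pmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{pmatrix}, \tag{3.17}$$

where $g_{ij} = X_i \cdot X_j$ and $L_{ij} = X_{ij} \cdot N$. In classical differential geometry, this matrix equation for the coefficients $a^i_j$ is called the Weingarten equations. Equation (3.17) is written as

$$a^i_j = -\sum_{k=1}^{2} g^{ik} L_{kj},$$

where $g^{ij}$ are the components of the inverse matrix $(g_{kl})^{-1}$.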
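The Weingarten equations (3.17) can be checked on a sphere of radius $r$, where the unit normal $N = X/r$ satisfies $N_j = X_j / r$, so the matrix $(a^i_j)$ should be $\tfrac{1}{r}$ times the identity. The sketch below is an illustration with assumed choices: the spherical parametrization, the orientation $N = X/r$, and the function names are not taken from the text.

```python
import math


def dot(a, b):
    return sum(s * t for s, t in zip(a, b))


def weingarten_sphere(r, u, v):
    """Matrix (a^i_j) = -g^{-1} L for X(u,v) = r(cos u sin v, sin u sin v, cos v)."""
    su, cu, sv, cv = math.sin(u), math.cos(u), math.sin(v), math.cos(v)
    X1 = (-r * su * sv, r * cu * sv, 0.0)          # X_u
    X2 = (r * cu * cv, r * su * cv, -r * sv)       # X_v
    X11 = (-r * cu * sv, -r * su * sv, 0.0)        # X_uu
    X12 = (-r * su * cv, r * cu * cv, 0.0)         # X_uv
    X22 = (-r * cu * sv, -r * su * sv, -r * cv)    # X_vv
    N = (cu * sv, su * sv, cv)                     # chosen unit normal, N = X / r
    g = [[dot(X1, X1), dot(X1, X2)], [dot(X2, X1), dot(X2, X2)]]
    L = [[dot(X11, N), dot(X12, N)], [dot(X12, N), dot(X22, N)]]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    return [
        [-sum(g_inv[i][k] * L[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


if __name__ == "__main__":
    print(weingarten_sphere(2.0, 0.7, 1.1))
```

With the opposite orientation $N = -X/r$, every entry would change sign, which reflects the dependence of the Gauss map on the choice of orientation.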


Corollary 3.4.5 (The Chain Rule). Let $M$, $N$, and $S$ be differentiable manifolds, and consider $F : M \to N$ and $G : N \to S$ to be differentiable maps between them. Then

$$(G \circ F)_* = G_* \circ F_*.$$

More specifically, at every point $p \in M$,

$$d(G \circ F)_p = dG_{F(p)} \circ dF_p.$$


Proof. By Proposition 3.4.2, for all functions $h$ from a neighborhood of $G(F(p))$ on $S$ to $\mathbb{R}$ and for all $X \in T_pM$, we have

$$\begin{aligned}
(G \circ F)_*(X)(h) &= X(h \circ G \circ F) = (F_*(X))(h \circ G) \\
&= (G_*(F_*(X)))(h) = (G_* \circ F_*)(X)(h).
\end{aligned}$$
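In coordinates, the chain rule says that the matrix of $d(G \circ F)_p$ is the product of the matrices of $dG_{F(p)}$ and $dF_p$. The sketch below illustrates this with two hypothetical coordinate expressions `F` and `G` (arbitrary smooth maps chosen for the example) and central-difference Jacobians.

```python
import math


def F(pt):
    """A sample coordinate expression R^2 -> R^3."""
    u, v = pt
    return (u * v, u + v * v, math.sin(u))


def G(pt):
    """A sample coordinate expression R^3 -> R^2."""
    a, b, c = pt
    return (a * b + c, b * c)


def jacobian(fn, pt, h=1e-6):
    """Central-difference Jacobian; entry [j][i] is d fn^j / d x^i."""
    n_out = len(fn(pt))
    J = [[0.0] * len(pt) for _ in range(n_out)]
    for i in range(len(pt)):
        plus, minus = list(pt), list(pt)
        plus[i] += h
        minus[i] -= h
        fp, fm = fn(plus), fn(minus)
        for j in range(n_out):
            J[j][i] = (fp[j] - fm[j]) / (2 * h)
    return J


def matmul(A, B):
    """Plain matrix product of nested lists."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def chain_rule_gap(pt):
    """Largest entrywise difference between J(G o F) and J(G) J(F) at pt."""
    lhs = jacobian(lambda q: G(F(q)), pt)
    rhs = matmul(jacobian(G, F(pt)), jacobian(F, pt))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    print(chain_rule_gap((0.3, -0.5)))
```

The printed gap is at the level of the finite-difference error, consistent with the corollary.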


Definition 3.4.1 for the differential avoids referring to any coordinate neighborhood on $M$. In contrast to the matrix for the differential introduced in Chapter 1, the matrix for the differential $df_p$ of maps between manifolds depends on the coordinate charts used around $p$ and $f(p)$, according to Equations (3.13) and (3.14). We can, however, say the following about how the matrix of the differential changes under coordinate changes.

Proposition 3.4.6. Let $f : M \to N$ be a differentiable map between differentiable manifolds. Let $x = (x^1, \ldots, x^m)$ and $\bar{x} = (\bar{x}^1, \ldots, \bar{x}^m)$ be two coordinate systems in a neighborhood of $p$, and let $y = (y^1, \ldots, y^n)$ and $\bar{y} = (\bar{y}^1, \ldots, \bar{y}^n)$ be two coordinate systems in a neighborhood of $f(p)$. Let $[df_p]$ be the matrix for $df_p$ associated to the $x$- and $y$-coordinate systems and let $\overline{[df_p]}$ be the matrix of the differential of $f$ but expressed in $\bar{x}$-coordinates in the domain and $\bar{y}$-coordinates in the codomain. Then

$$\overline{[df_p]} = \left(\left.\frac{\partial \bar{y}^i}{\partial y^j}\right|_{f(p)}\right) [df_p] \left(\left.\frac{\partial x^k}{\partial \bar{x}^l}\right|_p\right).$$


Proof. (The proof is left as an exercise for the reader.)


PROBLEMS 


3.4.1. Let $F: \mathbb{R}^m \to \mathbb{R}^n$ be a linear transformation. Show that under the
identification of $T_p(\mathbb{R}^k)$ with $\mathbb{R}^k$ as described in Example 3.3.7, $F_*$ is
identified with F.


3.4.2. Consider a differentiable manifold $M^m$ and a real-valued, differentiable function
$h: M^m \to \mathbb{R}$. Apply Proposition 3.4.2 to show that $h_*(X)$ corresponds to the
differential operator

\[ h_*(X) = X(h)\,\frac{d}{dt} \]

on functions $g: \mathbb{R} \to \mathbb{R}$, where we assume we use the variable t on $\mathbb{R}$.


3.4.3. Let $T^2$ be the torus given as a subset of $\mathbb{R}^3$ with a parametrization

\[ X(u,v) = ((2 + \cos v)\cos u,\ (2 + \cos v)\sin u,\ \sin v). \]

Consider the sphere $S^2$ given as a subset of $\mathbb{R}^3$, and use the stereographic atlas
$\{\pi_N, \pi_S\}$ as the coordinate patches of $S^2$. Consider the map $f: T^2 \to S^2$
defined by

\[ f(\vec{x}) = \frac{\vec{x}}{\|\vec{x}\|}. \]

Explicitly calculate the matrix of the differential $df_p$, with p given in terms of
(u, v)-coordinates for $(u,v) \in (0, 2\pi)^2$ and using the stereographic atlas on the
sphere.
3.4.4. Let $S^3$ be the 3-sphere given in $\mathbb{R}^4$ by $S^3 = \{u \in \mathbb{R}^4 : \|u\| = 1\}$,
and let $S^2$ be the unit sphere in $\mathbb{R}^3$, where we use coordinates $(x_1, x_2, x_3)$.
Consider the Hopf map $h: S^3 \to S^2$ given by

\[ h(u_1, u_2, u_3, u_4) = \big(2(u_1u_2 + u_3u_4),\ 2(u_1u_4 - u_2u_3),\ (u_1^2 + u_3^2) - (u_2^2 + u_4^2)\big). \]

(Note: the description of h is equivalent to the one given in Problem 3.2.7.)

(a) Show that this map indeed surjects $S^3$ onto $S^2$.

(b) Show that the preimage $h^{-1}(q)$ of any point $q \in S^2$ is a circle on $S^3$. [Using
the notation of 3.2.7, show that $h^{-1}(1 : z_2)$ is the circle in $\mathbb{C}^2$ parametrized
by $\gamma(t) = (Re^{it}, Rz_2e^{it})$ with $R = 1/\sqrt{1 + |z_2|^2}$ and $h^{-1}(0 : 1)$ is the
circle in $\mathbb{C}^2$ parametrized by $(0, e^{it})$.]

(c) For a coordinate patch of your choice on $S^3$ and also on $S^2$, calculate the
differential $dh_p$ for points p on $S^3$.


3.4.5. Consider the map $F: \mathbb{RP}^3 \to \mathbb{RP}^2$ defined by

\[ F(x : y : z : w) = (x^3 - y^3 : xyz - 2xw^2 + z^3 : z^3 + 2yz^2 - 6y^2z - w^3). \]

This function is homogeneous, and the result of Problem 3.2.8 ensures that this
map is differentiable. Let $p = (1 : 2 : -1 : 3) \in \mathbb{RP}^3$.

(a) After choosing standard coordinate neighborhoods of $\mathbb{RP}^3$ and $\mathbb{RP}^2$ that
contain, respectively, p and F(p), calculate the matrix of $dF_p$ with respect
to these coordinate neighborhoods.

(b) Choose a different pair of coordinate neighborhoods for p and F(p) and
repeat the above calculation.

(c) Explain how these two matrices are related.


3.4.6. Let $f: U \to \mathbb{R}$ be a function of class $C^2$ over an open set $U \subset \mathbb{R}^2$. Use the
coordinates (u, v) that arise from the parametrization of the graph M by (u, v, f(u, v)).
Define $n: M \to S^2$ to be the function that returns the upward pointing unit
normal vector of M, as a subset of $\mathbb{R}^3$.

(a) Find the matrix of $dn_p$ for an arbitrary point $p \in M$ where we use the atlas
$\{\pi_N, \pi_S\}$ for charts on $S^2$.

(b) Find the matrix of $dn_p$ for an arbitrary point $p \in M$ where we use the inverse
of the parametrization N(u, v) as a coordinate chart for the image set n(U)
in $S^2$.


3.4.7. Example 3.1.6 shows that if we give $\mathbb{R}^3$ the coordinates $(x_0, x_1, x_2)$, there is a
natural surjection $\pi: \mathbb{R}^3 - \{(0,0,0)\} \to \mathbb{RP}^2$ via $\pi(x_0, x_1, x_2) = (x_0 : x_1 : x_2)$.
Consider the unit sphere $S^2$ (centered at the origin), and consider the map
$g: S^2 \to \mathbb{RP}^2$ given as the restriction of $\pi$ to $S^2$. Using the oriented atlas on the
sphere given in Example 3.7.3 and the coordinate patches for $\mathbb{RP}^2$ as described in
Example 3.1.6, give the matrix for $dg_p$ between the north pole patch $\pi_N$ and $U_0$.
Do the same between the north pole patch and $U_1$ and explicitly verify Proposition
3.4.6.


3.4.8. Prove Proposition 3.4.6.


Figure 3.9: Open subsets in a half-space of $\mathbb{R}^2$.


3.4.9. Let $M_1$ and $M_2$ be two differentiable manifolds, and consider their product manifold
$M_1 \times M_2$. Call $\pi_i: M_1 \times M_2 \to M_i$ for i = 1, 2 the projection maps. Show that
for all points $p_1 \in M_1$ and $p_2 \in M_2$, the linear transformation

\[ S: T_{(p_1,p_2)}(M_1 \times M_2) \to T_{p_1}M_1 \oplus T_{p_2}M_2, \qquad
 X \mapsto (\pi_{1*}(X), \pi_{2*}(X)) \]

is an isomorphism.


3.5 Manifolds with Boundaries 


Despite the flexibility of the definition of a differentiable manifold, it does not allow 
for a boundary. In many applications, it is useful to have the notion of a manifold 
with a boundary. This notion relies on the concept of a Euclidean half-space. 


Definition 3.5.1. Let $\vec{a}$ be a unit vector in $\mathbb{R}^n$, i.e., $\vec{a} \in S^{n-1}$. The
half-space $H_{\vec{a}}$ is

\[ H_{\vec{a}} = \{\vec{x} \in \mathbb{R}^n \mid \vec{x} \cdot \vec{a} \geq 0\}. \]

The boundary of the half-space is $\partial H_{\vec{a}} = \{\vec{x} \in \mathbb{R}^n \mid \vec{x} \cdot \vec{a} = 0\}$.


Note that for distinct unit vectors $\vec{a}$ and $\vec{b}$, the half-spaces $H_{\vec{a}}$ and
$H_{\vec{b}}$ are not equal.

Since the topology on $H_{\vec{a}}$ is the subset topology inherited from $\mathbb{R}^n$, a set is open
in $H_{\vec{a}}$ if and only if it is equal to $U \cap H_{\vec{a}}$ for some open set $U \subset \mathbb{R}^n$.
Figure 3.9 depicts a Euclidean half-space of $\mathbb{R}^2$ along with two open subsets. One open set
arises as an open square which is already a subset of $H_{\vec{a}}$ and so does not include any
point of its boundary. The other open set arises as the intersection of an open disk
with $H_{\vec{a}}$. This intersection includes the segment along $\partial H_{\vec{a}}$, the line
perpendicular to $\vec{a}$.

Definition 3.5.2. A differentiable n-manifold M with boundary has the same definition
as in Definition 3.1.3 except that the ranges for the charts are open subsets
of a half-space H of $\mathbb{R}^n$. The boundary of the manifold, written $\partial M$, is the set of
points p such that in some coordinate chart $\phi: U \to H$, where H is a half-space,
$\phi(p) \in \partial H$.


The most commonly used Euclidean half-spaces in $\mathbb{R}^n$ are the upper half-space
and the lower half-space, defined respectively as

\[ \mathbb{R}^n_+ = \{(x_1, x_2, \dots, x_n) \in \mathbb{R}^n \mid x_n \geq 0\} = H_{(0,\dots,0,1)}, \]
\[ \mathbb{R}^n_- = \{(x_1, x_2, \dots, x_n) \in \mathbb{R}^n \mid x_n \leq 0\} = H_{(0,\dots,0,-1)}. \]


For any unit vector $\vec{a}$, define the projection function $\pi_{\vec{a}}: H_{\vec{a}} \to \partial H_{\vec{a}}$ as

\[ \pi_{\vec{a}}(\vec{x}) = \vec{x} - \mathrm{proj}_{\vec{a}}\,\vec{x} = \vec{x} - (\vec{a} \cdot \vec{x})\,\vec{a}. \tag{3.18} \]


Since $\partial H_{\vec{a}}$ is an (n − 1)-dimensional subspace of $\mathbb{R}^n$, assigning coordinates to
elements of $\partial H_{\vec{a}}$ gives an isomorphism (and homeomorphism) between $\mathbb{R}^{n-1}$ and
$\partial H_{\vec{a}}$.
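
As a small illustration of Equation (3.18), the sketch below computes the projection $\pi_{\vec{a}}$ for one particular unit vector and checks that the image lies on the boundary hyperplane. The specific vectors and the function names are illustrative assumptions only.

```python
import math

def dot(u, v):
    """Euclidean dot product of two vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def in_half_space(x, a, tol=1e-12):
    """True when x lies in H_a = {x : x . a >= 0} (up to a small tolerance)."""
    return dot(x, a) >= -tol

def project_to_boundary(x, a):
    """pi_a(x) = x - (a . x) a, assuming a is a unit vector."""
    c = dot(a, x)
    return [xi - c * ai for xi, ai in zip(x, a)]

a = [1 / math.sqrt(3)] * 3       # a unit vector in R^3
x = [2.0, 0.5, -1.0]             # a point with x . a > 0
image = project_to_boundary(x, a)
twice = project_to_boundary(image, a)
```

Here `image` satisfies `dot(image, a) == 0` up to rounding, and applying the projection a second time changes nothing, as expected of a projection onto $\partial H_{\vec{a}}$.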


Proposition 3.5.3. Let M be a differentiable (respectively, $C^k$, smooth, analytic)
n-manifold with boundary. Its boundary $\partial M$ is a differentiable (respectively, $C^k$,
smooth, analytic) (n − 1)-manifold without boundary.


Proof. Let $\mathcal{A} = \{\phi_\alpha\}_{\alpha \in I}$ be an atlas for M. Let $I'$ be the subset of the
indexing set I such that the domain of $\phi_\alpha$ contains points of $\partial M$. For $\alpha \in I'$
with $\phi_\alpha: U_\alpha \to H$ for some half-space H, we consider the projection
$\pi: H \to \partial H$ as a mapping into $\mathbb{R}^{n-1}$.

By definition, the restricted function $\psi_\alpha = (\pi \circ \phi_\alpha)|_{U_\alpha \cap \partial M}$ is a
bijection onto its image. It is continuous, as the restriction of the composition of two
continuous maps. The projection function $\pi$ is an open function, so maps open sets to open
sets. Hence, since $\phi_\alpha$ is continuous and maps open sets to open sets, then so does
$\psi_\alpha$. This shows that $\psi_\alpha$ is a homeomorphism from $U_\alpha \cap \partial M$ to its image.

Furthermore, the domains of $\psi_\alpha$ for $\alpha \in I'$ cover $\partial M$.

Finally, consider two overlapping charts $\phi_\alpha: U_\alpha \to H_\alpha$ and
$\phi_\beta: U_\beta \to H_\beta$ with $\alpha, \beta \in I'$. Consider the transition function

\[ \psi_\alpha \circ \psi_\beta^{-1} = \pi \circ (\phi_\alpha \circ \phi_\beta^{-1}) \circ \pi^{-1} :
 \pi \circ \phi_\beta(U_\alpha \cap U_\beta) \to \pi \circ \phi_\alpha(U_\alpha \cap U_\beta). \]

Then $\pi^{-1}: \pi \circ \phi_\beta(U_\alpha \cap U_\beta) \to \phi_\beta(U_\alpha \cap U_\beta)$ is a smooth
injection and $\pi: \phi_\alpha(U_\alpha \cap U_\beta) \to \mathbb{R}^{n-1}$ is a smooth projection. Hence, the
differentiability class of $\psi_\alpha \circ \psi_\beta^{-1}$ is the same as the differentiability
class of $\phi_\alpha \circ \phi_\beta^{-1}$.

This shows that the collection $\mathcal{A}' = \{\psi_\alpha\}_{\alpha \in I'}$ equips $\partial M$ with the same
differentiable (respectively, $C^k$ or smooth) structure as M.
Example 3.5.4 (Closed Interval). Let M = [a, b] be a closed and bounded real
interval. Define $\phi_1: [a, b) \to \mathbb{R}_+$ by $\phi_1(x) = x - a$ and $\phi_2: (a, b] \to \mathbb{R}_-$
as $\phi_2(x) = x - b$. The set $\{\phi_1, \phi_2\}$ equips [a, b] with the structure of a manifold
with boundary. Note that the boundary $\partial M = \{a, b\}$ is a discrete manifold, consisting
of exactly two points.
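
The two charts of this example can be written out directly. The sketch below uses the sample endpoints a = 1 and b = 4 (an arbitrary choice for illustration) and shows that each endpoint is sent to the boundary point 0 of its half-line, while the transition function on the overlap is the translation $t \mapsto t + a - b$.

```python
a, b = 1.0, 4.0   # sample endpoints, chosen only for illustration

def phi1(x):
    """Chart on [a, b) with values in R_+ = [0, infinity)."""
    assert a <= x < b
    return x - a

def phi2(x):
    """Chart on (a, b] with values in R_- = (-infinity, 0]."""
    assert a < x <= b
    return x - b

def phi1_inv(t):
    """Inverse of phi1 on its image [0, b - a)."""
    return t + a

def transition(t):
    """phi2 o phi1^{-1}, defined for t in (0, b - a); equals t + a - b."""
    return phi2(phi1_inv(t))
```

Since the transition function is a translation, it is smooth, which is consistent with the claim that the two charts form an atlas.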


Example 3.5.5 (Closed Ball). Example 3.1.5 inspires a relatively easy way to equip
the closed unit ball $B^3 = \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 \leq 1\}$ with the structure
of a manifold with boundary. Consider first the function
$\phi_1: B^3 \cap \{(x, y, z) \in \mathbb{R}^3 \mid z > 0\} \to \mathbb{R}^3$ defined by

\[ \phi_1(x, y, z) = \big(x,\ y,\ \sqrt{1 - x^2 - y^2} - z\big). \]

We can visualize in Figure 3.3 the domain of this function as the portion of the
closed ball inside the dome corresponding to $X_{(1)}$. Since $z \leq \sqrt{1 - x^2 - y^2}$ in the
domain of $\phi_1$, then the codomain of $\phi_1$ is $\mathbb{R}^3_+$. Furthermore,
$\phi_1(x, y, z) \in \partial\mathbb{R}^3_+$ if and only if $\sqrt{1 - x^2 - y^2} - z = 0$, which is precisely the
portion of the unit sphere in $\{(x, y, z) \in \mathbb{R}^3 \mid z > 0\}$. It is easy to see that $\phi_1$
is continuous and also a homeomorphism with its image.

We can create in a similar manner charts $\phi_2$, $\phi_3$, $\phi_4$, $\phi_5$, and $\phi_6$ corresponding
to $X_{(2)}$, $X_{(3)}$, $X_{(4)}$, $X_{(5)}$, and $X_{(6)}$.

As constructed, the union of the domains of $\phi_i$ for i = 1, 2, ..., 6 is not all of $B^3$
but only $B^3 - \{(0,0,0)\}$. To remedy this situation, it suffices to enlarge the domain
of at least one of the $\phi_i$ functions to include (0,0,0). For example, using as the
domain of $\phi_1$ the set

\[ U_1 = B^3 \cap \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + (z - 1)^2 < 2\} \]

suffices. The open set $U_1$ in $B^3$ includes the same portion of the manifold's boundary
as the open set $B^3 \cap \{(x, y, z) \in \mathbb{R}^3 \mid z > 0\}$, but also includes (0,0,0).

We leave it as an exercise for the reader to show that the transition functions
between coordinate charts are differentiable. Therefore, the atlas $\{\phi_1, \phi_2, \dots, \phi_6\}$
equips the closed ball $B^3$ with the structure of a manifold with boundary. The
boundary $\partial B^3$ is the unit sphere $S^2$.
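
The behavior of the chart $\phi_1$ on the enlarged domain $U_1$ can be spot-checked numerically. The sketch below is only an illustration at a few sample points (the points and the tolerance are assumptions): interior points get a positive third coordinate, points of the sphere with z > 0 get third coordinate 0, and the origin belongs to $U_1$.

```python
import math

TOL = 1e-12   # tolerance for floating-point comparisons (an assumption)

def phi1(x, y, z):
    """Chart phi_1(x, y, z) = (x, y, sqrt(1 - x^2 - y^2) - z) on the closed ball."""
    assert x * x + y * y + z * z <= 1.0 + TOL
    return (x, y, math.sqrt(max(0.0, 1.0 - x * x - y * y)) - z)

def in_U1(x, y, z):
    """Membership in U_1 = B^3 intersected with {x^2 + y^2 + (z - 1)^2 < 2}."""
    in_ball = x * x + y * y + z * z <= 1.0 + TOL
    return in_ball and x * x + y * y + (z - 1.0) ** 2 < 2.0

origin_image = phi1(0.0, 0.0, 0.0)     # the origin maps to (0, 0, 1)
interior_image = phi1(0.1, 0.2, 0.3)   # third coordinate is positive
sphere_image = phi1(0.6, 0.0, 0.8)     # third coordinate is 0 up to rounding
```

Note that the south pole (0, 0, −1) fails the second condition defining $U_1$, so it must be covered by one of the other charts.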


Example 3.5.6. As an example of a manifold with boundary, consider the half-torus
in $\mathbb{R}^3$ given as the image of $X: [0, \pi] \times [0, 2\pi] \to \mathbb{R}^3$, with

\[ X(u,v) = ((2 + \cos v)\cos u,\ (2 + \cos v)\sin u,\ \sin v). \]

The image of X is a half-torus M with $y \geq 0$, which, to conform to Definition 3.5.2,
is easily covered by four coordinate patches. The boundary $\partial M$ is the manifold
consisting of two connected components:

\[ R_1 = \{(x, y, z) \mid (x + 2)^2 + z^2 = 1,\ y = 0\}, \]
\[ R_2 = \{(x, y, z) \mid (x - 2)^2 + z^2 = 1,\ y = 0\}. \]

We leave it as an exercise for the reader to decide on precise patches that make this
half-torus into a manifold with boundary. (See Exercise 3.5.2.)


Since manifolds with boundaries are topological spaces, the concept of a continuous
map between them is still the same as Definition A.2.26. Furthermore, the
concept of a differentiable map between manifolds with or without
boundaries remains essentially the same as the original Definition 3.2.1 but with one
clarification.

Let $f: M^m \to N^n$ be a continuous function from a manifold M with boundary
to any other manifold. Deciding the limit or the differentiability of f at a point $p \in M$
that is not on the boundary $\partial M$ is the same as always. However, suppose that
$p \in \partial M$, with (U, x) a coordinate neighborhood of p, where x has for codomain a
half-space H of $\mathbb{R}^m$, and (V, y) a coordinate neighborhood of f(p). Then the function
$y \circ f \circ x^{-1}$ described in Definition 3.2.1 has the domain $x(U \cap f^{-1}(V))$, which is
an open subset of the half-space H. Since $p \in \partial M$, then $x(p) \in \partial H$. In order to
decide on the differentiability of $y \circ f \circ x^{-1}$ at x(p), we only consider the condition
and the limit in Definition 1.2.14 for h such that $x(p) + h \in H$. This is a restricted
limit, which generalizes a one-sided limit from calculus of a single variable.

The concept of a manifold with boundary allows us to generalize the notion of
a curve on a manifold.


Definition 3.5.7. A differentiable curve on a manifold M, possibly with boundary,
is a differentiable function $\gamma: J \to M$, where J is any interval of the real line and
is understood as a one-dimensional manifold, possibly with boundary.


This definition allows for curves with endpoints on a manifold. 

Since curves on manifolds served an essential role in defining the tangent space
to a manifold at a point, curves with endpoints help us define the tangent space
to a manifold with boundary M at any point p, even if $p \in \partial M$. We now restate
Definition 3.3.1 to accommodate manifolds with boundary.


Definition 3.5.8. Let $M^m$ be a differentiable manifold (possibly with boundary)
and let p be a point on M. Let J be some interval of $\mathbb{R}$ containing 0 and let
$\gamma: J \to M$ be a differentiable curve on M with $\gamma(0) = p$. For any real-valued
differentiable function f defined on some neighborhood of p, we define the directional
derivative of f along $\gamma$ at p to be the number

\[ D_\gamma(f) = \frac{d}{dt} f(\gamma(t))\bigg|_{t=0}, \tag{3.19} \]

where this derivative is understood as a one-sided derivative if 0 is an endpoint of
the interval J. The operator $D_\gamma$ is called the tangent vector to $\gamma$ at p.


Though this definition adds nothing new for points $p \in M$ that are not on the
boundary $\partial M$, including curves with endpoints allows us to consider curves whose
endpoints are on the boundary.
Figure 3.10: Tangent vectors to point on the boundary.


Figure 3.10 depicts a manifold with boundary M, specifically a half sphere, along
with a curve $\gamma: [-1, 0] \to M$ such that $\gamma(0) = p$. The figure illustrates what occurs
in regular calculus, visualizing a tangent vector as a vector in $\mathbb{R}^3$. However, even in
this context, the tangent vector $\gamma'(0)$ must be understood as a one-sided derivative,
namely,

\[ \gamma'(0) = \lim_{h \to 0^-} \frac{1}{h}\big(\gamma(h) - \gamma(0)\big). \]

It is not surprising then that in the more abstract setting of manifolds with boundary,
(3.19) should involve a one-sided derivative.

Section 3.3 showed that the set of tangent vectors to a manifold M at a point p
is a vector space.

It takes a little more work to show that the set of tangent vectors to M at p, even
when $p \in \partial M$, is a vector space. For example, showing that the set of tangent
vectors is closed under scalar multiplication breaks into two cases. Suppose that
$\varepsilon > 0$ and that $\gamma: [-\varepsilon, 0] \to M$ is a curve on M with $\gamma(0) = p$. Then if a > 0,
defining $\gamma_1: [-\varepsilon/a, 0] \to M$ by $\gamma_1(t) = \gamma(at)$ will give $D_{\gamma_1} = aD_\gamma$. However, if
a < 0, defining $\gamma_2: [0, -\varepsilon/a] \to M$ by $\gamma_2(t) = \gamma(at)$ leads to $D_{\gamma_2} = aD_\gamma$. The issue
here is that we needed to change the domain of $\gamma_2$ when a < 0. However, combining
cases, we see that the set of tangent vectors is closed under scalar multiplication.
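
The two cases can be illustrated numerically on a very simple manifold with boundary. In the sketch below, M is taken to be the closed upper half-plane, p is the origin, and the curve, the function f, and the step size are all hypothetical choices made for illustration. The derivative in (3.19) is approximated by a one-sided difference quotient that only evaluates the curve inside its domain.

```python
import math

def gamma(t):
    """A curve on the closed upper half-plane {y >= 0}, defined on [-1, 0], with gamma(0) = (0, 0)."""
    assert -1.0 <= t <= 0.0
    return (t, -t + t * t)

def f(point):
    """A sample differentiable function near p = (0, 0)."""
    x, y = point
    return math.exp(x) + 3.0 * y

def one_sided_derivative(curve, func, side, h=1e-6):
    """Difference quotient of func(curve(t)) at t = 0 using t = side * h only."""
    t = side * h
    return (func(curve(t)) - func(curve(0.0))) / t

def rescale(curve, a):
    """The reparametrized curve t -> curve(a t)."""
    return lambda t: curve(a * t)

D_gamma = one_sided_derivative(gamma, f, side=-1)               # domain [-1, 0]
D_pos = one_sided_derivative(rescale(gamma, 2.0), f, side=-1)   # a = 2, domain [-1/2, 0]
D_neg = one_sided_derivative(rescale(gamma, -3.0), f, side=+1)  # a = -3, domain [0, 1/3]
```

For this choice of curve and function the exact value of $D_\gamma(f)$ is −2, so one expects `D_pos` to be close to −4 and `D_neg` close to 6. When a < 0 the difference quotient must be taken from the right, which mirrors the change of domain described above.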

We leave as an exercise for the reader the technical details for the following 
proposition. 


Proposition 3.5.9. Let M be a manifold with boundary $\partial M$ and suppose that
$p \in \partial M$. The set of tangent vectors to M at p forms a vector space.


As with manifolds without boundary, we call the set of tangent vectors to M at
p the tangent space to M at p and denote it by $T_pM$.

With the notions developed in this section, Definition 3.4.1 for the differential
of a map between manifolds does not need to change for the generalized context of
manifolds with boundary.


PROBLEMS 
3.5.1. Explicitly show that the solid ball $B^n = \{(x_1, \dots, x_n) \in \mathbb{R}^n \mid x_1^2 + \cdots + x_n^2 \leq 1\}$
is a smooth n-manifold with boundary and show that its boundary is the sphere $S^{n-1}$.


3.5.2. Referring to the parametrization X (u,v) in Example 3.5.6, give four coordinate 
patches that equip the half-torus with the structure of a manifold with boundary. 


3.5.3. Prove that all the transition functions in Example 3.5.5 are differentiable. 


3.5.4. This exercise shows how to equip the closed ball
$B^3 = \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 \leq 1\}$ with the structure of a manifold with
boundary in a manner inspired by stereographic projection. Let N = (0, 0, 1) and define
the function $\Pi_N: B^3 - \{N\} \to \mathbb{R}^3$ as follows. For all $A \in B^3 - \{N\}$, let B be the
intersection of the line AN with the unit sphere $S^2$. Then $\Pi_N(A) = (\pi_N(B), \lambda)$, where
$\pi_N$ is the stereographic projection as in Example 3.1.4 and where, as vectors,
$\overrightarrow{BA} = \lambda\,\overrightarrow{BN}$.

(a) Calculate $\Pi_N(x, y, z)$ explicitly.

(b) Show that the third component of $\Pi_N(x, y, z)$ is equal to 0 if and only if
$(x, y, z) \in S^2 - \{N\}$.

(c) Show that $\Pi_N$ is a homeomorphism between $B^3 - \{N\}$ and
$\{(x, y, z) \in \mathbb{R}^3 \mid 0 \leq z < 1\}$.

(d) Let S = (0, 0, −1) and define $\Pi_S$ in a similar fashion as $\Pi_N$. If we call
$(\bar{u}, \bar{v}, \bar{w}) = \Pi_S \circ \Pi_N^{-1}(u, v, w)$, show that

\[ (\bar{u}, \bar{v}, \bar{w}) = \left( \frac{u(1 - w)}{u^2 + v^2 + w},\ \frac{v(1 - w)}{u^2 + v^2 + w},\ \frac{w(1 - w)}{u^2 + v^2 + w} \right) \]

and deduce that $\Pi_S \circ \Pi_N^{-1}$ is differentiable on its domain.

[These steps show that the atlas $\{\Pi_N, \Pi_S\}$ equips $B^3$ with the structure of a
differentiable manifold with boundary.]


3.5.5. Show that if M is a compact manifold, then so is $\partial M$. [Hint: See Definition A.2.51.]


3.5.6. Modify the approach in Section 3.3 to prove Proposition 3.5.9. 


3.6 Immersions, Submersions, and Submanifolds 


The linear transformation $F_*$, which is implicitly local to p, and the associated
matrix $[dF_p]$ allow us to discuss the relation of one manifold to another. A number
of different situations occur frequently enough to warrant their own terminologies.


Definition 3.6.1. Let $F: M \to N$ be a differentiable map between differentiable
manifolds.

1. If $F_*$ is an injection at all points $p \in M$, then F is called an immersion.

2. If $F_*$ is a surjection at all points $p \in M$, then F is called a submersion.

3. If F is an immersion and one-to-one, then the pair (M, F) is called a submanifold
of N.

4. If (M, F) is a submanifold and $F: M \to F(M)$ is a homeomorphism for the
topology on F(M) induced from N, then F is called an embedding and F(M)
is called an embedded submanifold.


Figure 3.11: Double cone. Figure 3.12: Enneper's surface.


It is important to give examples of the above four situations. Clearly, every
embedded submanifold is a submanifold and every submanifold is an immersion.
In fact, in the theory of differentiable manifolds, it is only in the context of
Definition 3.6.1 that we can discuss how a manifold "sits" in an ambient Euclidean
space by considering a differentiable function $f: M \to \mathbb{R}^n$, where $\mathbb{R}^n$ is viewed as
a manifold with its usual differential structure.

These three categories represent different situations that we addressed when
studying regular surfaces in $\mathbb{R}^3$. The cylinder $S^1 \times \mathbb{R}$ is a differentiable manifold. We
can consider the double cone in Figure 3.11 as the image of a map $f: S^1 \times \mathbb{R} \to \mathbb{R}^3$
given by

\[ f(u,v) = (v\cos u,\ v\sin u,\ v), \]

where we use u as an angle. We note that

\[ [df] = \begin{pmatrix} -v\sin u & \cos u \\ v\cos u & \sin u \\ 0 & 1 \end{pmatrix}. \]

Figure 3.13: Not a homeomorphism.


Clearly, at all points $(u, 0) \in S^1 \times \mathbb{R}$, the first column of [df] vanishes, so the
differential $df_p$ is not injective. Thus, the map f describing the cone is not an
immersion into $\mathbb{R}^3$.
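
The rank drop along v = 0 can be confirmed with a short computation. The sketch below evaluates the matrix [df] of the cone map at two sample points and computes its rank from the 2 × 2 minors; the helper name `rank_3x2` and the sample points are illustrative choices.

```python
import math

def df(u, v):
    """Matrix of the differential of f(u, v) = (v cos u, v sin u, v)."""
    return [[-v * math.sin(u), math.cos(u)],
            [v * math.cos(u), math.sin(u)],
            [0.0, 1.0]]

def rank_3x2(A, tol=1e-12):
    """Rank of a 3x2 matrix, read off from its three 2x2 minors."""
    minors = [A[i][0] * A[j][1] - A[j][0] * A[i][1]
              for i, j in ((0, 1), (0, 2), (1, 2))]
    if any(abs(m) > tol for m in minors):
        return 2
    if any(abs(entry) > tol for row in A for entry in row):
        return 1
    return 0

rank_at_vertex = rank_3x2(df(0.7, 0.0))   # a point with v = 0
rank_elsewhere = rank_3x2(df(0.7, 1.5))   # a point with v != 0
```

The differential has rank 1 at points with v = 0 and rank 2 elsewhere, so injectivity fails exactly along the circle that f collapses to the vertex of the cone.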
Enneper's surface (see Figure 3.12) is the locus of the parametrization

\[ X(u,v) = \left( u - \frac{u^3}{3} + uv^2,\ v - \frac{v^3}{3} + vu^2,\ u^2 - v^2 \right)
 \quad \text{for } (u,v) \in \mathbb{R}^2. \tag{3.20} \]

Enneper's surface can be considered a differentiable map $X: \mathbb{R}^2 \to \mathbb{R}^3$. It is not
hard to check (Exercise 3.6.1) that according to Definition 3.6.1, Enneper's surface
is an immersion, but because X is not one-to-one, the surface is not a submanifold.
(In Figure 3.12, the locus of self-intersection of the parametrized surface is indicated
in thick black or thick white.)
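
Both claims about Enneper's surface can be spot-checked numerically. The sketch below is not a proof: it evaluates the parametrization (3.20) at the two parameter points $(0, \pm\sqrt{3})$, which map to the same point of $\mathbb{R}^3$, and it compares $\|X_u \times X_v\|$ with the closed form $(1 + u^2 + v^2)^2$ at one sample point. The partial derivatives were computed by hand from (3.20), and the sample points are arbitrary choices.

```python
import math

def X(u, v):
    """Parametrization (3.20) of Enneper's surface."""
    return (u - u**3 / 3 + u * v**2,
            v - v**3 / 3 + v * u**2,
            u**2 - v**2)

def partials(u, v):
    """Partial derivatives X_u and X_v of the parametrization."""
    Xu = (1 - u**2 + v**2, 2 * u * v, 2 * u)
    Xv = (2 * u * v, 1 - v**2 + u**2, -2 * v)
    return Xu, Xv

def cross_norm(u, v):
    """Norm of X_u x X_v; it is nonzero exactly when dX is injective."""
    Xu, Xv = partials(u, v)
    c = (Xu[1] * Xv[2] - Xu[2] * Xv[1],
         Xu[2] * Xv[0] - Xu[0] * Xv[2],
         Xu[0] * Xv[1] - Xu[1] * Xv[0])
    return math.sqrt(sum(ci * ci for ci in c))

s = math.sqrt(3)
P, Q = X(0.0, s), X(0.0, -s)   # two distinct parameter points
```

Since $(1 + u^2 + v^2)^2$ never vanishes, the cross product check is consistent with X being an immersion, while the coincidence P = Q exhibits the failure of injectivity.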

To illustrate the idea of a submanifold that is not an embedded submanifold,
consider the ribbon surface in Figure 3.13. We can consider this surface to be a
function between manifolds in the following sense. Consider the two-dimensional
manifold without boundary M = (0, 5) × (0, 1) with the natural product topology
and differential structure inherited from $\mathbb{R}^2$. Then the ribbon surface can be viewed
as the image of a differentiable map $f: (0, 5) \times (0, 1) \to \mathbb{R}^3$ between manifolds.
One of the (open) ends of the ribbon comes arbitrarily close to the surface of the
ribbon ("touching" but not intersecting). The pair (M, f) is a submanifold but not
an embedded submanifold because, as Figure 3.13 shows, open sets on M might
not be open sets on f(M) with the topology induced from $\mathbb{R}^3$. Note that no open
set V of $\mathbb{R}^3$ around f(p) can intersect f(M) to give exactly the set f(U), where U is
the open neighborhood around p that is depicted by a darker gray circle.

Some authors (usually out of sympathy for their readers) introduce the theory of
"manifolds in $\mathbb{R}^n$." By this we mean manifolds that are embedded submanifolds of
$\mathbb{R}^n$. Though not as general as Definition 3.6.1, that approach has some merit as it
more closely mirrors Definition 3.1.1 for regular surfaces in $\mathbb{R}^3$. However, our current
approach is more general. Admittedly, it might seem strange to call a differentiable
map a submanifold, but, as the above examples show, this tactic generalizes the
various situations of interest for subsets of $\mathbb{R}^3$. Furthermore, this approach again
removes the dependence on an ambient Euclidean space. Consequently, it is not at
all strange to discuss submanifolds of $\mathbb{RP}^n$ or any other space of interest.

We now wish to discuss specifically embedded submanifolds of a differentiable 
manifold since they occupy an important role in subsequent sections and allow us 
to quickly determine certain classes of manifolds. 


Proposition 3.6.2. Let $M^m$ be a differentiable manifold. An open subset S of M
is an embedded submanifold of dimension m.


Proof. Let $\{\phi_i: U_i \to \mathbb{R}^m\}_{i \in I}$ be the atlas of M. Equip S with the atlas
$\{\phi_i|_S\}_{i \in I}$. The inclusion map $\iota: S \to M$ is a one-to-one immersion. The topology
of S is induced from M, so S, with the given atlas, is an embedded submanifold of M.


Example 3.6.3. Consider the set $M_{n \times n}$ of n × n matrices with real coefficients.
We can equip $M_{n \times n}$ with a Euclidean topology by identifying $M_{n \times n}$ with $\mathbb{R}^{n^2}$.
In particular, $M_{n \times n}$ is a differentiable manifold. Consider the subset $GL_n(\mathbb{R})$ of
invertible matrices in $M_{n \times n}$. We claim that, with the topology induced from $M_{n \times n}$,
$GL_n(\mathbb{R})$ is an embedded submanifold. We can see this by the fact that an n × n
matrix A is invertible if and only if $\det A \neq 0$. However, the function
$\det: M_{n \times n} \to \mathbb{R}$ is continuous, and therefore,

\[ GL_n(\mathbb{R}) = \det{}^{-1}(\mathbb{R} - \{0\}) \]

is an open subset of $M_{n \times n}$. Proposition 3.6.2 proves the claim.


The proof of Proposition 3.6.2 is deceptively simple. If $S \subset M$, though the
inclusion map $\iota: S \to M$ is obviously one-to-one, one cannot use it to show that
any subset is an embedded submanifold. Consider the subset S of $\mathbb{R}^2$ defined by
the equation $y^2 - x^3 = 0$ (see Figure 3.14). The issue is that in order to view S as
a manifold, we must equip it with an atlas. In this case, the atlas of $\mathbb{R}^2$ consists of
one coordinate chart, the identity map. The restriction of the identity map $\mathrm{id}|_S$ is
not a homeomorphism into an open subset of $\mathbb{R}$ or $\mathbb{R}^2$ so cannot serve as a chart.
In fact, if we put any atlas $\{\phi_i\}$ of coordinate charts on S, the inclusion
$\iota: S \to \mathbb{R}^2$ is such that $\iota \circ \phi_i^{-1}$ will be some regular reparametrization of
$t \mapsto (t^2, t^3)$, i.e.,

\[ \iota \circ \phi_i^{-1}(t) = (g(t)^2, g(t)^3), \]

where $g'(t) \neq 0$. Hence,

\[ [d(\iota \circ \phi_i^{-1})_t] = \begin{pmatrix} 2g(t)g'(t) \\ 3g(t)^2 g'(t) \end{pmatrix}, \]

and thus $\iota$ fails to be an immersion at the point where g(t) = 0, which corresponds
to $(0,0) \in S$, the cusp of the curve.

Figure 3.14: Not an embedded submanifold of $\mathbb{R}^2$.

Having a clear definition of the differential of a function between manifolds, we 
can now imitate Definition 1.4.1 to give a definition for regular points and for critical 
points of functions between manifolds. 


Definition 3.6.4. Let $f: M^m \to N^n$ be a differentiable map between differentiable
manifolds. Then any point $p \in M$ is called a critical point if
$\mathrm{rank}(f_*) < \min(m, n)$, i.e., $df_p$ is not of maximal rank. If p is a critical point,
then the image f(p) is called a critical value. Furthermore, any element $q \in N$ that is
not a critical value is called a regular value (even if $q \notin f(M)$).


We remind the reader that this definition for critical point directly generalizes all
the previous definitions for critical points of functions (see the discussion following
Definition 1.4.1). The only novelty here from the discussion in Chapter 1 was to
adapt the definition for functions from $\mathbb{R}^m$ to $\mathbb{R}^n$ to functions between manifolds.

Our main point in introducing the above definition is to introduce the Regular
Value Theorem. A direct generalization of a similar theorem for regular surfaces (see
Proposition 5.2.13 in [5]), the Regular Value Theorem provides a class of examples
of manifolds for which it would otherwise take a considerable amount of work to
verify that these sets are indeed manifolds. However, we need a few supporting
theorems first.


Theorem 3.6.5. Let $f: M^m \to N^n$ be a differentiable function between differentiable
manifolds, and assume that $df_p$ is injective. Then there exist charts $\phi$ around
p and $\psi$ around f(p) that are compatible with the respective differential structures
on M and N and such that $\bar{f} = \psi \circ f \circ \phi^{-1}$ corresponds to the standard inclusion

\[ (x_1, \dots, x_m) \mapsto (x_1, \dots, x_m, 0, \dots, 0) \]

of $\mathbb{R}^m$ into $\mathbb{R}^n$.

Proof. Let $\phi$ and $\psi$ be charts on M and N, respectively, for neighborhoods of p
and f(p). If necessary, translate $\phi$ and $\psi$ so that $\phi(p)$ and $\psi(f(p))$ correspond to
the origin $\vec{0}$. Then, with respect to these charts, as matrices $[df]_p = [d\bar{f}]_{\vec{0}}$, so
$d\bar{f}_{\vec{0}}$ is injective by assumption. The image of the linear transformation
$d\bar{f}_{\vec{0}}: \mathbb{R}^m \to \mathbb{R}^n$ is a subspace of $\mathbb{R}^n$. Thus, by a rotation in $\mathbb{R}^n$ (applied to
the chart $\psi$), we can assume that $\mathrm{Im}(d\bar{f}_{\vec{0}})$ is $\{(x_1, \dots, x_m, 0, \dots, 0) \in \mathbb{R}^n\}$.

We wish to change coordinates on $\mathbb{R}^n$ via some diffeomorphism $h: \mathbb{R}^n \to \mathbb{R}^n$
that would make $\bar{f}$ the standard inclusion. We view $\mathbb{R}^n$ as $\mathbb{R}^m \times \mathbb{R}^{n-m}$ and define
the function $h: \mathbb{R}^n \to \mathbb{R}^n$ by

\[ h(x, y) = \bar{f}(x) + (0, \dots, 0, y) = \big(\pi \circ \bar{f}(x),\ \pi^{\perp} \circ \bar{f}(x) + y\big), \tag{3.21} \]

where $\pi$ is the orthogonal projection of $\mathbb{R}^n$ onto the subspace of its first m
components and $\pi^{\perp}$ is the orthogonal projection onto the last n − m components. Note
that the differential of h at $\vec{0}$ is $dh_{\vec{0}} = d\bar{f}_{\vec{0}} \oplus \mathrm{id}$ or, as matrices,

\[ [dh_{\vec{0}}] = \begin{pmatrix} [\pi \circ d\bar{f}_{\vec{0}}] & 0 \\ 0 & I_{n-m} \end{pmatrix}. \]

Since $d\bar{f}_{\vec{0}}$ is injective, we see that $dh_{\vec{0}}$ is invertible.

Now, by the Inverse Function Theorem (Theorem 1.4.5), there exists some open
neighborhood V of $\vec{0}$ such that h is injective on V, h(V) is open, and the inverse
function $h^{-1}$ exists and is differentiable. Thus, h is a diffeomorphism between open
neighborhoods of the origin in $\mathbb{R}^n$. We reparametrize the neighborhood of f(p) with
the chart $h^{-1} \circ \psi$. Since h is a diffeomorphism, replacing $\psi$ with $\psi' = h^{-1} \circ \psi$ leads
to an atlas that is compatible with the atlas on N. Furthermore, since
$h(x, 0) = \bar{f}(x)$ by Equation (3.21), the new $\bar{f}$ satisfies

\[ \psi' \circ f \circ \phi^{-1}(x) = h^{-1} \circ \bar{f}(x) = (x, 0, \dots, 0) \]

as desired.


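The functional relationship discussed in the above proof is often depicted using
the following diagram:

\[ \begin{array}{ccc}
 M^m & \xrightarrow{\ f\ } & N^n \\
 \downarrow{\scriptstyle \phi} & & \downarrow{\scriptstyle \psi} \\
 \mathbb{R}^m & \xrightarrow{\ \bar{f}\ } & \mathbb{R}^n
\end{array} \]

We say that the diagram is commutative if, when one takes different directed paths
from one node to another, the different compositions of the corresponding functions
are equal. In this simple case, to say that the above diagram is commutative means
that

\[ \psi \circ f = \bar{f} \circ \phi. \]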




These kinds of diagrams are often used in algebra and in geometry as a schematic 
to represent the kind of relationship illustrated by Figure 3.7. 

Theorem 3.6.5 offers a strategy to prove a number of theorems about embedded 
submanifolds. We mention a few of these here and leave some others for the reader 
to prove. 


Corollary 3.6.6. Let $M^m$ be a differentiable manifold, and let $S \subset M$. The subset
S is an embedded submanifold of M of dimension k if and only if, for all $p \in S$,
there is a coordinate neighborhood $(U, \phi)$ of p compatible with the atlas on M such
that

\[ \phi(U \cap S) = \{(x^1, \dots, x^k, x^{k+1}, \dots, x^m) \in \phi(U) \mid x^{k+1} = \cdots = x^m = 0\}. \]


Proof. The implication (⟹) follows immediately from Theorem 3.6.5.

For the converse (⟸), assume that for all $p \in S$, there is a coordinate neighborhood
$(U, \phi)$ in M compatible with the atlas of M satisfying the condition for $U \cap S$.
We cover S with a collection of such open sets $\{U_\alpha \cap S\}_{\alpha \in I}$. Let
$\pi: \mathbb{R}^m \to \mathbb{R}^k$ be the projection that ignores the last m − k variables. Then on each
coordinate neighborhood, $\psi_\alpha = \pi \circ \phi_\alpha: U_\alpha \cap S \to \mathbb{R}^k$ is a coordinate chart for S.
Since $\phi_\alpha: U_\alpha \to \phi_\alpha(U_\alpha)$ is a homeomorphism and since

\[ \phi_\alpha(U_\alpha \cap S) \subset \{(x^1, \dots, x^k, x^{k+1}, \dots, x^m) \in \mathbb{R}^m \mid x^{k+1} = \cdots = x^m = 0\}, \]

then $\pi \circ \phi_\alpha$ is a homeomorphism onto its image. By definition,
$\phi_\beta \circ \phi_\alpha^{-1}$ is differentiable for any pair of indices $\alpha$ and $\beta$. However,
$\psi_\beta \circ \psi_\alpha^{-1} = \pi \circ (\phi_\beta \circ \phi_\alpha^{-1}) \circ \iota$, where $\iota: \mathbb{R}^k \to \mathbb{R}^m$ is the
inclusion $x \mapsto (x, 0, \dots, 0)$, and so is differentiable as well. Consequently,
$\{(U_\alpha \cap S, \psi_\alpha)\}_{\alpha \in I}$ forms an atlas on S that gives S the structure of a
differentiable manifold. Furthermore, the inclusion map satisfies all the requirements
of an embedded submanifold.


The following theorem is similar to Theorem 3.6.5 but applies to local submer- 
sions. 


Theorem 3.6.7. Let $f: M^m \to N^n$ be a differentiable map such that
$f_*: T_pM \to T_{f(p)}N$ is onto. Then there are charts $\phi$ at p and $\psi$ at f(p) compatible
with the differentiable structures on M and N such that

\[ \psi \circ f = \pi \circ \phi, \]

where $\pi$ is the standard projection of $\mathbb{R}^m$ onto $\mathbb{R}^n$ by ignoring the last m − n variables.


Proof. (The proof mimics the proof of Theorem 3.6.5 and is left to the reader.)


Theorem 3.6.8 (Regular Value Theorem). Suppose that $m \geq n$, let $f: M^m \to N^n$
be a differentiable map, and let q be a regular value of f. Then $f^{-1}(q)$ is an embedded
submanifold of M of dimension m − n.
Proof. Since $m \geq n$ and q is a regular value, then for all $p \in f^{-1}(q)$, $df_p$ has rank
n.

We first prove that the set of points $p \in M$ where $\mathrm{rank}\, df_p = n$ is an open
subset of M. Let $\{(U_\alpha, \phi_\alpha)\}$ be an atlas for M. For all $\alpha$, define
$g_\alpha: U_\alpha \to \mathbb{R}$, with $g_\alpha(p)$ as the sum of squares of all n × n minors of $[df_p]$. Note
that there are $\binom{m}{n}$ such minors in $[df_p]$ and that, for all $\alpha$, each function $g_\alpha$
is well defined on the coordinate patch $U_\alpha$. The functions $g_\alpha$ need not induce a well
defined function $g: M \to \mathbb{R}$. (We would need
$g_\alpha|_{U_\alpha \cap U_\beta} = g_\beta|_{U_\alpha \cap U_\beta}$ for all pairs $(\alpha, \beta)$.) The equation
$g_\alpha(p) = 0$ holds if and only if all the maximal minors of $[df_p]$ are 0, which is
equivalent to $\mathrm{rank}\, df_p < n$. However, $\mathrm{rank}\, df_p$ is independent of any coordinate
system, so regardless of choices made in the construction of $g_\alpha$, we have
$g_\alpha^{-1}(0) \cap U_\beta = g_\beta^{-1}(0) \cap U_\alpha$. Define $V_\alpha = g_\alpha^{-1}(\mathbb{R} - \{0\})$. Since each
$g_\alpha$ is continuous, $V_\alpha$ is open and the set

\[ V = \bigcup_{\alpha \in I} V_\alpha \]

is an open subset of M. By construction, V is precisely the set of points at which
$\mathrm{rank}\, df_p = n$. Since V is open in M, by Proposition 3.6.2, V is an embedded
submanifold of dimension m.

We consider now the differentiable map $f|_V: V \to N$. Let $p \in f^{-1}(q)$. By
construction of V, the differential $df_p$ is surjective for all $p \in V$. Applying Theorem
3.6.7, we can assume that there is a coordinate chart $(U, \phi)$ of p with coordinates
$(x^1, \dots, x^m)$ and a chart $\psi$ of q such that $\psi \circ f = \pi \circ \phi$, where $\pi$ is the projection
of $\mathbb{R}^m$ onto $\mathbb{R}^n$ by ignoring the last m − n coordinates. Furthermore, by taking
a translation if necessary, we can assume that $\psi(q) = (0, 0, \dots, 0)$. Consequently,
$\psi(q) = \pi \circ \phi(p)$ and so, in coordinates,

\[ \phi(f^{-1}(q) \cap U) = \{(x^1, \dots, x^m) \in \phi(U) \mid x^1 = \cdots = x^n = 0\}. \]

Up to a reordering of the coordinates, this is the condition of Corollary 3.6.6 with
k = m − n. By Corollary 3.6.6, $f^{-1}(q)$ is an embedded submanifold of M.


The Regular Value Theorem is also called the Regular Level Set Theorem because
any subset of the form $f^{-1}(q)$, where $q \in N$, is called a level set of f.


Example 3.6.9 (Spheres). With the Regular Value Theorem at our disposal, it is
now easy to show that certain objects are differentiable manifolds. We consider the
sphere $S^n$ as the subset of $\mathbb{R}^{n+1}$, with

\[ (x^1)^2 + (x^2)^2 + \cdots + (x^{n+1})^2 = 1. \]

The Euclidean spaces $\mathbb{R}^{n+1}$ and $\mathbb{R}$ are differentiable manifolds with trivial coordinate
charts. Consider the differentiable map $f: \mathbb{R}^{n+1} \to \mathbb{R}$ defined by

\[ f(x) = \|x\|^2. \]

In the standard coordinates, the differential of f is

\[ [df] = \begin{pmatrix} 2x^1 & 2x^2 & \cdots & 2x^{n+1} \end{pmatrix}. \]

We note that the only critical point of f is (0, ..., 0) and that the only critical value
is 0. Thus, $S^n = f^{-1}(1)$ is an embedded submanifold of $\mathbb{R}^{n+1}$, and hence, $S^n$ is a
differentiable manifold in its own right when equipped with the subspace topology
of $\mathbb{R}^{n+1}$. Notice that this establishes $S^2$ as an embedded submanifold of $\mathbb{R}^3$ without
reference to any charts.
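
The computation in this example is simple enough to mirror in a few lines. The sketch below evaluates f and the row matrix [df] at sample points of $\mathbb{R}^4$ (the points are arbitrary illustrations) and confirms that the origin is a critical point with critical value 0, while a sample point of the level set $f^{-1}(1) = S^3$ is a regular point.

```python
def f(x):
    """f(x) = ||x||^2 on R^(n+1), with x given as a list of coordinates."""
    return sum(xi * xi for xi in x)

def df(x):
    """The 1 x (n+1) matrix of the differential of f in standard coordinates."""
    return [2 * xi for xi in x]

def is_critical_point(x):
    """df_x has rank 0 < min(n+1, 1) exactly when every entry of [df] vanishes."""
    return all(c == 0 for c in df(x))

origin = [0.0, 0.0, 0.0, 0.0]
point_on_sphere = [0.5, 0.5, 0.5, 0.5]   # a point of S^3 in R^4
```

Because [df] at x is the row vector 2x, it vanishes only at the origin, so 1 is a regular value and the Regular Value Theorem applies.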


PROBLEMS 


3.6.1. Prove that $[dX_{(u,v)}]$ for the parametrization of Enneper’s surface in (3.20) is
injective for all $(u,v) \in \mathbb{R}^2$. [This confirms that the parametrization of Enneper’s
surface is an immersion of $\mathbb{R}^2$ in $\mathbb{R}^3$.]

3.6.2. Let $M$ be a differentiable manifold, and suppose that $f : M \to \mathbb{R}$ is a differentiable
map. Prove that if $f_* = 0$ at all points of $M$, then $f$ is constant on each connected
component of $M$.

3.6.3. Let $M_{m \times n}$ be the set of $m \times n$ matrices. Show that the subset of $M_{m \times n}$ of
matrices of rank less than or equal to $r$ is an embedded submanifold.

3.6.4. Show that the square as described in Example 3.1.13 is not an embedded
submanifold of $\mathbb{R}^2$.

3.6.5. Show that the function f : R² → R³ defined by f(u,v) = (u? + u + v, u? + uv, v°)
is smooth and injective but is not an immersion.

3.6.6. Let $N$ be an embedded submanifold of a differentiable manifold $M$. Prove that
at all points $p \in N$, the space $T_pN$ is a subspace of $T_pM$.

3.6.7. Let $M^m$ be a differentiable manifold that is embedded in $\mathbb{R}^n$. (By Exercise 3.6.6,
$T_pM$ is a subspace of $T_p(\mathbb{R}^n) \cong \mathbb{R}^n$.) Let $f : \mathbb{R}^n \to \mathbb{R}$ be a differentiable function
defined in a neighborhood of $p \in M$. Show that if $f$ is constant on $M$, then
$f_*(v) = 0$ for all $v \in T_pM$. Conclude that, viewed as a vector in $T_p(\mathbb{R}^n)$, the
differential $df_p$ is perpendicular to $T_pM$.

3.6.8. Let $M$ be a differentiable manifold, and let $U$ be an open set in $M$. Define
$\iota_U : U \to M$ as the inclusion map. Prove that for any $p \in U$, the differential
$\iota_* : T_pU \to T_pM$ is an isomorphism.

3.6.9. Let $M$ and $N$ be $k$-manifolds in $\mathbb{R}^n$, in the sense that they are both embedded
submanifolds. Show that the set $M \cup N$ is not necessarily an embedded submanifold
of $\mathbb{R}^n$. Give sufficient conditions for $M \cup N$ to be a manifold.

3.6.10. Suppose that the defining rectangle of the Klein bottle, as illustrated in Figure 3.6
or 3.16, is $[0,2\pi] \times [0,2\pi]$. It is a well-known fact that it is impossible to embed
the Klein bottle in $\mathbb{R}^3$. Show that the parametrization

$$X(u,v) = \big( (2 + \cos v)\cos u,\ (2 + \cos v)\sin u,\ \sin v \cos(u/2),\ \sin v \sin(u/2) \big)$$

gives an embedding of the Klein bottle in $\mathbb{R}^4$. (Remark: This parametrization is
similar to the standard parametrization of the torus in $\mathbb{R}^3$ as the union of circles
traced in the normal planes of a planar circle of larger radius. A planar circle in
$\mathbb{R}^4$ admits a normal three-space. The parametrization $X$ is the locus of a circle
in the normal three-space that rotates in the fourth coordinate dimension by half
a twist as one travels around the circle of larger radius.)


3.6.11. Define $O(n)$ as the set of all orthogonal $n \times n$ matrices.

(a) Prove that $O(n)$ is a smooth manifold of dimension $\frac{1}{2}n(n-1)$.

(b) Consider the tangent space to $O(n)$ at the identity matrix, $T_I(O(n))$, as a
subspace of the tangent space to $M_{n \times n}$ (which is $M_{n \times n}$ itself). Prove that
$A \in M_{n \times n}$ is a tangent vector in $T_I(O(n))$ if and only if $A$ is skew-symmetric,
i.e., $A^T = -A$.

3.6.12. Prove Theorem 3.6.7.

3.6.13. Let $M^m$ and $N^n$ be embedded submanifolds of a differentiable manifold $S^s$, and
suppose that $m + n > s$. Let $p \in M \cap N$. We say that $M$ and $N$ intersect
transversally at $p$ in $S$ if $T_pM + T_pN = T_pS$, viewed as subspaces of $T_pS$ by
virtue of Problem 3.6.6. In this exercise, you will show that if $M$ and $N$ intersect
transversally at each point of $M \cap N$, then $M \cap N$ is a differentiable manifold.

(a) Let $p \in M \cap N$. Prove that there is a coordinate chart $(U,\phi)$ of $p$ and a
function $f_1 : U \to \mathbb{R}^{s-m}$ such that $U \cap M = f_1^{-1}(0)$. [This also shows that
there is a coordinate chart $(V,\psi)$ of $p$ and a function $f_2 : V \to \mathbb{R}^{s-n}$ such
that $V \cap N = f_2^{-1}(0)$.]

(b) Consider the function $F : U \cap V \to \mathbb{R}^{s-m} \times \mathbb{R}^{s-n}$ defined by $F(x) =
(f_1(x), f_2(x))$. Prove that $(0,0)$ is a regular value.

(c) Deduce that $M \cap N$ is an embedded submanifold of $S$.


3.7 Orientability 


In the final section in this chapter we introduce the notion of orientability. We
usually first encounter this notion with the example of the Möbius strip $M$. See
Figure 3.15. Consider the Möbius strip as a 2-manifold with boundary embedded in
$\mathbb{R}^3$, and consider trying to assign a unit normal vector to every point on the Möbius
strip in a continuous fashion. Such an assignment corresponds to a continuous
function $n : M \to S^2$. It is not hard to see that this is impossible: Once a unit
normal vector goes around the strip once, it will be pointing in the other direction.

Figure 3.15: Möbius strip.

Since manifolds do not necessarily exist in some ambient Euclidean space, we
cannot talk about unit normal vectors to generalize the notion of orientability.
The Klein bottle gives us another way of visualizing non-orientability. Consider a
curve on the Klein bottle as shown in Figure 3.16. We point out that the depicted
curve is differentiable: In this diagram for the Klein bottle, at the point $p$, the
“vertical” orientation changes as the curve passes through the vertical boundary of
the diagram. Attached to the curve, we show a unit tangent vector to the curve and
a unit normal as well, which form a variable frame. As we move along the curve
from left to right starting from the point $a$, the unit tangent and the unit normal
change continuously. As we cycle around the diagram of the Klein bottle, passing
through the vertical boundary on the right, vectors (and directions) are reflected
vertically. Returning to the point $a$, we do not recover the frame we started with,
but one with opposite orientation. We point out that we did not have to use this
particular frame involving a unit tangent and a unit normal to the curve, or even
an orthonormal one; this observation would remain the same with any frame.

Figure 3.16: The Klein bottle.

Suppose that $\gamma(t)$ continuously parametrizes the curve on the Klein bottle. If
we call $T(t)$ and $U(t)$ the tangent and normal vectors in this diagram, we see that
$\det(T(t)\ U(t))$, which measures the sign of the angle swept from $T(t)$ to $U(t)$, is
not a continuous function; it must change sign in a discontinuous fashion. This
observation gives us another way to think about orientability without reference to
an ambient space. However, this still is not quite enough to motivate a definition.
Notice that if we had done the same construction with a vertical line connecting
the two instances of $q$ in the diagram, the same function $\det(T(t)\ U(t))$ would be
continuous along the curve.

The key to the notion of orientability that we do consider from the example of 
the Klein bottle is that of variable frames on the manifold whose determinant does 
not change sign. 


Definition 3.7.1. Let $M^n$ be a differentiable $n$-manifold equipped with an atlas
$\mathcal{A} = \{\phi_\alpha\}_{\alpha \in I}$. Suppose that for any two charts $\phi_\alpha$ and $\phi_\beta$ of the atlas $\mathcal{A}$, the
Jacobian determinant of the transition function $\phi_{\beta\alpha} = \phi_\beta \circ \phi_\alpha^{-1}$ is positive at all points in its
domain. Then $(M, \mathcal{A})$ is called an oriented manifold.


As we saw in Proposition 3.3.9, at any point $p \in M$ the matrix of the differential
$[d\phi_{\beta\alpha}]$ is the coordinate change matrix on $T_pM$ between the basis derived from the $\phi_\alpha$
coordinate chart and the basis derived from the $\phi_\beta$ chart. As a coordinate change
matrix, $[d\phi_{\beta\alpha}]$ is invertible and hence, its determinant is never 0.

At present, in our development of the theory of manifolds, we do not have a
way to connect a tangent space at one point to the tangent space at some other
nearby point of the manifold. Consequently, we cannot (currently) think of the
bases on $T_pM$ derived from the $\phi_\alpha$ coordinates as a variable frame, since each basis
is in a different vector space. Nonetheless, the function $\det(d\phi_{\beta\alpha})$ is defined over
$\phi_\alpha(U_\alpha \cap U_\beta)$.


Definition 3.7.2. Let $M^n$ be a differentiable $n$-manifold equipped with an atlas $\mathcal{A}$.
Then $M$ with $\mathcal{A}$ is called orientable if there is an atlas $\mathcal{B}$ on $M$ that is compatible
with $\mathcal{A}$ such that $M$ equipped with $\mathcal{B}$ is an oriented manifold.


Example 3.7.3. Consider Example 3.1.4 of the sphere, with the atlas $\mathcal{A} = \{\pi_N, \pi_S\}$
of stereographic projection from the North and South poles. From the change of
coordinates $(\bar u, \bar v) = \pi_S \circ \pi_N^{-1}(u,v)$ in (3.1), we find that

$$\det\big(d(\pi_S \circ \pi_N^{-1})\big) = \frac{\partial(\bar u, \bar v)}{\partial(u,v)}
= \begin{vmatrix} \dfrac{-u^2+v^2}{(u^2+v^2)^2} & \dfrac{-2uv}{(u^2+v^2)^2} \\[2ex] \dfrac{-2uv}{(u^2+v^2)^2} & \dfrac{u^2-v^2}{(u^2+v^2)^2} \end{vmatrix}
= \frac{(-u^2+v^2)(u^2-v^2) - 4u^2v^2}{(u^2+v^2)^4} = -\frac{1}{(u^2+v^2)^2}.$$

Consequently, the atlas $\{\pi_N, \pi_S\}$ on $S^2$ does not equip $S^2$ with the structure of an
oriented manifold.

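Consider instead the atlas $\mathcal{B} = \{\pi_N, \tilde\pi_S\}$, where $\tilde\pi_S$ is the composition of $\pi_S$
with the reflection $(u,v) \mapsto (u,-v)$ so that $\tilde\pi_S(x,y,z) = \left( \frac{x}{1+z}, -\frac{y}{1+z} \right)$. It is easy to
tell that $\{\pi_N, \tilde\pi_S\}$ is an atlas for $S^2$ that is compatible with $\{\pi_N, \pi_S\}$. Thus this
atlas gives $S^2$ the same differentiable structure as the original atlas. Furthermore,
writing $(\bar u, \bar v) = \tilde\pi_S \circ \pi_N^{-1}(u,v)$, we get

$$\bar u = \frac{u}{u^2+v^2} \quad \text{and} \quad \bar v = -\frac{v}{u^2+v^2}.$$

We easily find that the Jacobian is

$$\frac{\partial(\bar u, \bar v)}{\partial(u,v)}
= \begin{vmatrix} \dfrac{-u^2+v^2}{(u^2+v^2)^2} & \dfrac{-2uv}{(u^2+v^2)^2} \\[2ex] \dfrac{2uv}{(u^2+v^2)^2} & \dfrac{-u^2+v^2}{(u^2+v^2)^2} \end{vmatrix}
= \frac{1}{(u^2+v^2)^2}. \qquad (3.22)$$

This shows that the atlas $\{\pi_N, \tilde\pi_S\}$ gives the sphere the structure of an oriented
smooth manifold. So though $(S^2, \mathcal{A})$ is not an oriented manifold, it is orientable,
and $(S^2, \mathcal{B})$ is an oriented manifold.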
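
The two Jacobian signs in this example can be checked numerically. The sketch below is illustrative only: the function names, the central-difference step `h`, and the three sample points are assumptions, and the transition map is taken to be $(u,v) \mapsto (u/(u^2+v^2),\, v/(u^2+v^2))$, the form consistent with the determinants computed in this example.

```python
def transition(u, v):
    """pi_S o pi_N^{-1}: inversion in the unit circle, for (u, v) != (0, 0)."""
    r2 = u * u + v * v
    return (u / r2, v / r2)

def reflected_transition(u, v):
    """Transition map after composing pi_S with the reflection (u, v) -> (u, -v)."""
    ub, vb = transition(u, v)
    return (ub, -vb)

def jacobian_det(phi, u, v, h=1e-6):
    """Central-difference approximation of det d(phi) at the point (u, v)."""
    fu_p, fu_m = phi(u + h, v), phi(u - h, v)
    fv_p, fv_m = phi(u, v + h), phi(u, v - h)
    a = (fu_p[0] - fu_m[0]) / (2 * h)  # d(ubar)/du
    b = (fv_p[0] - fv_m[0]) / (2 * h)  # d(ubar)/dv
    c = (fu_p[1] - fu_m[1]) / (2 * h)  # d(vbar)/du
    d = (fv_p[1] - fv_m[1]) / (2 * h)  # d(vbar)/dv
    return a * d - b * c

# Arbitrary sample points away from the origin, where the maps are defined.
points = [(1.0, 0.5), (-0.3, 2.0), (0.7, -0.7)]
dets_original = [jacobian_det(transition, u, v) for u, v in points]
dets_reflected = [jacobian_det(reflected_transition, u, v) for u, v in points]
```

At each sample point the first list should approximate $-1/(u^2+v^2)^2$ and the second $+1/(u^2+v^2)^2$, matching the signs found for $\mathcal{A}$ and $\mathcal{B}$.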
Figure 3.17: Sequence in the proof of Proposition 3.7.5.


In the study of surfaces in $\mathbb{R}^3$, an orientation of an orientable surface meant
choosing one of the two possible directions for a unit normal vector function that is
continuous over the whole surface. With manifolds, this notion of choice is more
subtle but still exists.


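Lemma 3.7.4. Let $M$ be a connected differentiable manifold with atlas
$\mathcal{A} = \{(U_\alpha, \phi_\alpha)\}_{\alpha \in I}$. For any pair $\alpha, \alpha' \in I$, there exists a finite sequence
$\alpha = \alpha_1, \alpha_2, \ldots, \alpha_n = \alpha'$ of indices in $I$ such that $U_{\alpha_i} \cap U_{\alpha_{i+1}} \neq \emptyset$, for $i = 1,2,\ldots,n-1$.


Proof. If $U_\alpha \cap U_{\alpha'} \neq \emptyset$, then we are done. However, this need not be true. Let $C$ be
the set of indices $\alpha'' \in I$ such that there exists a finite sequence $\alpha = \alpha_1, \alpha_2, \ldots, \alpha_n = \alpha''$
such that $U_{\alpha_i} \cap U_{\alpha_{i+1}} \neq \emptyset$, for $i = 1,2,\ldots,n-1$. Then

$$U = \bigcup_{a \in C} U_a \quad \text{and} \quad V = \bigcup_{b \in I - C} U_b$$

are both open as unions of open sets and $U \cup V = M$ by definition of an atlas. We
claim that $U \cap V = \emptyset$. If not, then for any $x \in U \cap V$ we have $x \in U_a \cap U_b$ for some
$a \in C$ and $b \in I - C$. But this is a contradiction because in this case whatever finite
sequence of indices gave a chain of nonempty intersections from $U_\alpha$ to $U_a$ can
be extended by one more index, so that $b \in C$ and not in $I - C$. Since $M$ is connected
(see Definition A.2.62) and $U$ is nonempty because it contains $U_\alpha$, we conclude that
$V = \emptyset$ and $C = I$, and the result follows.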
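
The set $C$ in this proof is what a breadth-first search over the charts computes. The sketch below is a finite toy model, not a construction from the text: chart domains are stood in for by finite Python sets, and the name `chain_of_charts` and the sample data are hypothetical.

```python
from collections import deque

def chain_of_charts(domains, start, goal):
    """Return indices start = a_1, ..., a_n = goal whose consecutive domains
    intersect, or None if no such chain exists.

    `domains` maps a chart index to a finite set standing in for its domain.
    """
    parent = {start: None}          # the keys of `parent` play the role of C
    queue = deque([start])
    while queue:
        a = queue.popleft()
        if a == goal:
            chain = []
            while a is not None:    # walk back to the start index
                chain.append(a)
                a = parent[a]
            return chain[::-1]
        for b in domains:
            # Extend the chain by one index when the domains overlap.
            if b not in parent and domains[a] & domains[b]:
                parent[b] = a
                queue.append(b)
    return None

# Hypothetical cover of the "connected" set {0, ..., 6} by four chart domains.
domains = {
    "a": {0, 1, 2},
    "b": {2, 3},
    "c": {3, 4, 5},
    "d": {5, 6},
}
chain = chain_of_charts(domains, "a", "d")
```

When the domains split into two families with no overlaps between them, the search returns `None`, which mirrors the role of connectedness in the proof.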


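Proposition 3.7.5. Let $M^n$ be an orientable connected differentiable manifold and
let $\mathcal{A} = \{(U_\alpha, \phi_\alpha)\}_{\alpha \in I}$ and $\mathcal{B} = \{(V_\beta, \psi_\beta)\}_{\beta \in J}$ be two compatible atlases on $M$ both
of which separately make $M$ into an oriented manifold. Then either
$\det(d(\psi_\beta \circ \phi_\alpha^{-1})) > 0$ for all $(\alpha,\beta) \in I \times J$ or $\det(d(\psi_\beta \circ \phi_\alpha^{-1})) < 0$ for all $(\alpha,\beta) \in I \times J$ such
that $\psi_\beta \circ \phi_\alpha^{-1}$ have nonempty domains.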
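Proof. Let $\alpha \in I$ be arbitrary and let $\beta \in J$. Then, as a continuous function,
$\det(d(\psi_\beta \circ \phi_\alpha^{-1}))$ is either always positive or always negative on its domain. Suppose
that $\det(d(\psi_\beta \circ \phi_\alpha^{-1})) > 0$ and let $\beta' \in J$. By Lemma 3.7.4, there is a finite sequence
$\beta = \beta_1, \beta_2, \ldots, \beta_n = \beta'$ such that $V_{\beta_i} \cap V_{\beta_{i+1}} \neq \emptyset$. Furthermore, because we can
think of $U_\alpha$ as a submanifold of $M$, we can assume that $V_{\beta_i} \cap U_\alpha \neq \emptyset$ for all $V_{\beta_i}$ in
the sequence. (See Figure 3.17.)

Since $\mathcal{B}$ equips $M$ with an oriented differentiable manifold structure,
$\det(d(\psi_{\beta_{i+1}} \circ \psi_{\beta_i}^{-1})) > 0$ for all $i = 1,2,\ldots,n-1$. By the chain rule,

$$\begin{aligned}
\det\big(d(\psi_{\beta'} \circ \phi_\alpha^{-1})\big)
&= \det\big(d(\psi_{\beta_n} \circ \psi_{\beta_{n-1}}^{-1}) \circ \cdots \circ d(\psi_{\beta_2} \circ \psi_{\beta_1}^{-1}) \circ d(\psi_{\beta_1} \circ \phi_\alpha^{-1})\big) \\
&= \det\big(d(\psi_{\beta_n} \circ \psi_{\beta_{n-1}}^{-1})\big) \cdots \det\big(d(\psi_{\beta_2} \circ \psi_{\beta_1}^{-1})\big) \det\big(d(\psi_\beta \circ \phi_\alpha^{-1})\big) \\
&> 0.
\end{aligned}$$

Suppose alternatively that $\det(d(\psi_\beta \circ \phi_\alpha^{-1})) < 0$. Then the same composition holds
but the product of determinants leads to $\det(d(\psi_{\beta'} \circ \phi_\alpha^{-1})) < 0$. The result follows.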


Definition 3.7.6. Let $M^n$ be an orientable connected differentiable $n$-manifold.
Two compatible atlases $\mathcal{A} = \{\phi_\alpha\}_{\alpha \in I}$ and $\mathcal{B} = \{\psi_\beta\}_{\beta \in J}$ are said to have equivalent
orientations if the atlas $\mathcal{A} \cup \mathcal{B}$ also makes $M$ an oriented manifold. An equivalence
class of oriented atlases is called an orientation.


Proposition 3.7.5 shows that on a connected differentiable manifold there can
only be two orientations. More generally, if $M$ is a manifold with $c$ connected
components, there are $2^c$ possible orientations on $M$. This includes the degenerate
case of 0-manifolds that correspond to a set of points equipped with the discrete
topology. In this case, each point can have an orientation of $+1$ or $-1$.


Definition 3.7.7. Let $M^n$ be an oriented manifold and let $p \in M$. Any ordered
basis on $T_pM$ is called positively oriented if its change of coordinate matrix with
$(\partial_1, \partial_2, \ldots, \partial_n)$ has positive determinant.


We now discuss how orientations on manifolds with boundary $M$ induce orientations
on the boundary manifold $\partial M$. We must make a choice in how to induce an
orientation on the boundary but do so to conform with Green’s Theorem, Stokes’
Theorem, the divergence theorem, and even the fundamental theorem of calculus.
Recall that Green’s Theorem states that for a compact region $R \subset \mathbb{R}^2$ with a
boundary $\partial R$ that consists of a finite number of regular curves

$$\iint_R \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dA = \oint_{\partial R} \vec{F} \cdot d\vec{r},$$

for any differentiable vector field $\vec{F} = (F_1, F_2)$ on $R$. The integral on the right
breaks into the sum of $k$ integrals if $\partial R$ has $k$ boundary components. Furthermore,
we require that each boundary component be oriented so that as “someone travels
along the curve” the interior of the region is to the left. Another way to state this
choice of orientation of the boundary components is that the ordered pair of vectors
consisting of an outward pointing normal and the direction of travel along the curve
is a positive frame for $\mathbb{R}^2$. (See Figure 3.18.)

Figure 3.18: Orientation of the boundary of a plane region.

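Let $M^n$ be an oriented differentiable manifold with boundary. Let $p \in \partial M$,
and let $(U,\phi)$ be a coordinate neighborhood of $p$ with coordinates $(x^1, x^2, \ldots, x^n)$.
Recall that since $p \in \partial M$, the coordinate chart $\phi$ is a homeomorphism onto an open
subset of a half-space $\mathbb{H}^n_{\vec a}$ of $\mathbb{R}^n$. If $\vec a = (a^1, a^2, \ldots, a^n) \in \mathbb{R}^n$, the tangent vector

$$X_{\vec a} = a^1 \partial_1 + a^2 \partial_2 + \cdots + a^n \partial_n,$$

where $\partial_i = \partial/\partial x^i$, is in $T_pM$ but not in $T_p(\partial M)$. We say that $X_{\vec a}$ is a tangent vector
that points inward from $\partial M$, while $-X_{\vec a}$ is outward pointing. The induced coordinate
chart on $\partial M$ is $\pi \circ \phi : U \cap \partial M \to \mathbb{R}^{n-1}$ with coordinates $(u^1, u^2, \ldots, u^{n-1})$.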


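Definition 3.7.8. With $M$, $p$, and $X_{\vec a}$ as above, the ordered basis
$(\partial/\partial u^1, \ldots, \partial/\partial u^{n-1})$ of $T_p(\partial M)$ gives the induced orientation on $\partial M$ if

$$\left( -X_{\vec a},\ \frac{\partial}{\partial u^1},\ \ldots,\ \frac{\partial}{\partial u^{n-1}} \right)$$

is positively oriented on $T_pM$.


If the coordinate chart of a point on the boundary of an oriented manifold is
$\phi : U \to \mathbb{H}^n_{\vec a}$ with $\vec a = (0,\ldots,0,1)$, then $-\partial_n = -\partial/\partial x^n$ is a tangent vector
that points outward from $\partial M$. The ordered basis $(\partial_1, \ldots, \partial_{n-1})$ of $T_p(\partial M)$ gives
the induced orientation on $\partial M$ if $(-\partial_n, \partial_1, \ldots, \partial_{n-1})$ is positively oriented on $T_pM$.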
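
Whether $(-\partial_n, \partial_1, \ldots, \partial_{n-1})$ is positively oriented depends on the parity of $n$. The sketch below computes the determinant of the change of coordinate matrix from this frame to $(\partial_1, \ldots, \partial_n)$ for small $n$. It assumes the chart itself is positively oriented, and the function names are hypothetical.

```python
def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * entry * det(minor)
    return total

def boundary_frame_det(n):
    """det of the matrix whose columns are (-e_n, e_1, ..., e_{n-1})."""
    cols = [[-int(i == n - 1) for i in range(n)]]                   # -e_n
    cols += [[int(i == k) for i in range(n)] for k in range(n - 1)]  # e_1..e_{n-1}
    rows = [list(r) for r in zip(*cols)]   # transpose columns into rows
    return det(rows)

# Sign of the frame (-d_n, d_1, ..., d_{n-1}) relative to (d_1, ..., d_n).
signs = {n: boundary_frame_det(n) for n in range(1, 6)}
```

The computed determinant equals $(-1)^n$. For $n = 2$ it is $+1$, which agrees with the frame $(-\partial_2, \partial_1)$ having the same orientation as $(\partial_1, \partial_2)$.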


Example 3.7.9. Consider the half-torus $M$ shown in Figure 3.19. The boundary
$\partial M$ has two components. The point $q$ is a generic point in a neighborhood that
contains the boundary component where $p$ is. If $(x^1, x^2)$ is a coordinate system in a
neighborhood of $p$, the boundary component that contains $p$ is given by $x^2 = 0$. The
figure depicts the ordered basis $(\partial_1, \partial_2)$ at the generic point $q$ and also the ordered
basis $(-\partial_2, \partial_1)$ at $p$. Since these two bases have the same orientation (imagine
moving the standard basis at $q$ over to $p$), then $\partial_1$ determines the induced orientation
on $\partial M$ (as opposed to $-\partial_1$).

Figure 3.19: Half-torus with boundary.

For the other boundary component, the reasoning is the same except that we
must use at least one other coordinate chart $(\bar x^1, \bar x^2)$ where the boundary is given by
$\bar x^2 = 0$ and the portion of $M$ that is not on $\partial M$ has $\bar x^2 > 0$. Intuitively speaking, in
order for $(\bar x^1, \bar x^2)$ to have an orientation compatible with $(x^1, x^2)$, one must switch
the direction of the basis vector $\partial/\partial \bar x^2$ (from what one would obtain from moving
$\partial/\partial x^2$ over along a line of $\bar x^1 = \text{const.}$). We must then also switch the sign of $\partial/\partial x^1$
to get the equivalent $\partial/\partial \bar x^1$ in order to keep a positively oriented atlas. The induced
orientation on the second boundary component is shown with an arrow.


Example 3.7.10 (Closed Interval). We set a convention for use later concerning
1-manifolds. Let $\gamma : [a,b] \to M$ be a 1-manifold with two boundary points $p_1 = \gamma(a)$
and $p_2 = \gamma(b)$. Example 3.5.4 gives two coordinate charts that explicitly define $[a,b]$
as a manifold with boundary. If $p \in [a,b)$, then the basis on $T_pM$ with respect to
the chart $\phi_1$ is $\{d/dx\}$, while if $p \in (a,b]$, then the basis on $T_pM$ with respect to
the chart $\phi_2$ is $\{d/d\bar x\}$. It is easy to tell that for all $p \in (a,b)$, on $T_pM$, we have
$d/dx = d/d\bar x$. Clearly, $[a,b]$ with the atlas $\{\phi_1, \phi_2\}$ is an oriented manifold.

The outward pointing vector at $p_2$ is in the same orientation as $d/d\bar x$, so we say
that $p_2$ is equipped with a positive orientation. In contrast, the outward pointing
vector at $p_1$ is $-d/dx$, which is negative with respect to the induced orientation.

    -1          +1
     •-----------•
    p_1         p_2

This association of $-1$ and $+1$ to the endpoints as shown above is, by convention,
the induced orientation of $\gamma$ onto $\partial\gamma$.


PROBLEMS 


3.7.1. Prove that a manifold that has a single chart is orientable.

3.7.2. Prove that every one-dimensional manifold is orientable.

3.7.3. Let $M$ be a differentiable manifold of dimension 2 or greater that has an atlas of
exactly 2 charts. Prove that $M$ is orientable.

3.7.4. Show that the closed ball $B^3$ equipped with the atlas $\{\Pi_N, \Pi_S\}$ described in
Exercise 3.5.4 does not make $B^3$ into an oriented manifold. Modify the atlas to explicitly
show that $B^3$ is orientable. Sketch the ball and indicate with a frame in the tangent
plane to a point on the surface, the orientation that is induced on the surface
$S^2 = \partial B^3$.

3.7.5. Prove that if $M$ is an orientable manifold with boundary, then $\partial M$ is also orientable.

3.7.6. Show that $\mathbb{RP}^2$ is not orientable.

3.7.7. Let $M$ and $N$ be two orientable differentiable manifolds. Show that $M \times N$ with
the product structure is an orientable manifold.


CHAPTER 4 


Multilinear Algebra 


Many of the objects of interest in differential geometry on manifolds are expressed 
properly in the context of multilinear algebra. Consequently, this chapter introduces 
linear algebraic concepts that are not commonly included in a first linear algebra 
course. The underlying field for all objects outside this chapter is the set of reals R, 
but this chapter introduces the concepts for an arbitrary field K of characteristic 0 
(e.g., Q, R, or C). 

Before jumping in, we mention our habit of notation for components associated
to certain linear algebraic objects. Let $V$ be a vector space over $K$ with $\dim V = n$.
If $\mathcal{B} = (e_1, e_2, \ldots, e_n)$ is an ordered basis of $V$, the coordinates of $v \in V$ with respect
to $\mathcal{B}$ are

$$[v]_{\mathcal{B}} = \begin{pmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{pmatrix}, \quad \text{where } v = v^1 e_1 + v^2 e_2 + \cdots + v^n e_n.$$

If the basis of $V$ is understood from the problem or if we use a standard basis of
$V$, we write $[v]$. It is common to abuse the notation and say that a vector is equal
to the $n \times 1$ matrix of its coordinates but we must always be careful to understand
that components are given with respect to some basis.

If $V$ is a vector space of dimension $n$ and if $\mathcal{B} = \{e_1, e_2, \ldots, e_n\}$ and $\mathcal{B}' =
\{f_1, f_2, \ldots, f_n\}$ are two bases, there is an $n \times n$ matrix $P^{\mathcal{B}}_{\mathcal{B}'}$ that converts the
$\mathcal{B}$-coordinates of a vector to $\mathcal{B}'$-coordinates. In particular, for all $v \in V$,

$$[v]_{\mathcal{B}'} = P^{\mathcal{B}}_{\mathcal{B}'} [v]_{\mathcal{B}}. \qquad (4.1)$$

This matrix is found by $P^{\mathcal{B}}_{\mathcal{B}'} = \big( [e_1]_{\mathcal{B}'} \ [e_2]_{\mathcal{B}'} \ \cdots \ [e_n]_{\mathcal{B}'} \big)$. Writing the components
of $P^{\mathcal{B}}_{\mathcal{B}'}$ as $(p^i_j)$, where $i$ is the row index and $j$ is the column index, and the $\mathcal{B}'$
coordinates of $v$ as $(\bar v^i)$, we can write (4.1) as

$$\bar v^i = \sum_{j=1}^n p^i_j v^j. \qquad (4.2)$$

As we introduce constructions in multilinear algebra and refer to how the components
of objects change under a change of basis, we will refer to (4.2) repeatedly to
see appropriate generalizations.
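
As a small worked instance of (4.2), the sketch below uses an assumed two-dimensional example with $f_1 = e_1 + e_2$ and $f_2 = e_2$; the example data and the function name `change_coordinates` are not from the text.

```python
def change_coordinates(P, v):
    """Apply (4.2): vbar^i = sum_j p^i_j v^j, with P given as a list of rows."""
    n = len(v)
    return [sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]

# Assumed example: f_1 = e_1 + e_2 and f_2 = e_2, so that
# e_1 = f_1 - f_2 and e_2 = f_2.  The columns of P are [e_1]_{B'} and [e_2]_{B'}.
P = [[1, 0],
     [-1, 1]]

v_B = [3, 5]                            # v = 3 e_1 + 5 e_2
v_Bprime = change_coordinates(P, v_B)   # coordinates of v with respect to B'

# Consistency check: expand vbar^1 f_1 + vbar^2 f_2 back in the basis B.
f_in_B = [[1, 1], [0, 1]]               # [f_1]_B and [f_2]_B
recovered = [sum(v_Bprime[k] * f_in_B[k][i] for k in range(2)) for i in range(2)]
```

Here $v = 3e_1 + 5e_2 = 3f_1 + 2f_2$, so the $\mathcal{B}'$-coordinates are $(3, 2)$, and expanding them back recovers $(3, 5)$.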


4.1 Hom Space and Dual 


Definition 4.1.1. Let $V$ and $W$ be two vector spaces over $K$. Denote the set of
linear transformations from $V$ to $W$ by $\operatorname{Hom}_K(V,W)$, or simply $\operatorname{Hom}(V,W)$ if the
field $K$ is understood by context.


We can define addition and multiplication by a $K$-scalar on $\operatorname{Hom}(V,W)$ in the
following way. If $T_1, T_2 \in \operatorname{Hom}(V,W)$, then $T_1 + T_2$ is the linear transformation
given by

$$(T_1 + T_2)(v) \overset{\text{def}}{=} T_1(v) + T_2(v) \quad \text{for all } v \in V.$$

Also, if $\lambda \in K$ and $T \in \operatorname{Hom}(V,W)$, define the linear transformation $\lambda T$ by

$$(\lambda T)(v) \overset{\text{def}}{=} \lambda(T(v)) \quad \text{for all } v \in V.$$


These definitions lead us to the following foundational proposition. 


Proposition 4.1.2. Let V and W be vector spaces over K of dimension m and 
n, respectively. Then Hom(V,W) is a vector space over K, with dim Hom(V,W) = 
mn. 


Proof. We leave it to the reader to check that Hom(V,W) satisfies all the axioms 
of a vector space over K. 

To prove that $\dim \operatorname{Hom}(V,W) = mn$, first choose an ordered basis
$\mathcal{B} = (e_1, e_2, \ldots, e_m)$ of $V$ and an ordered basis $\mathcal{B}' = (f_1, f_2, \ldots, f_n)$ of $W$. Define
$T_{ij} \in \operatorname{Hom}(V,W)$ as the linear transformations defined by

$$T_{ij}(e_k) = \begin{cases} f_i, & \text{if } j = k, \\ 0, & \text{if } j \neq k, \end{cases}$$

and extended by linearity over all $V$. We show that the set $\{T_{ij}\}$ for $1 \le i \le n$
and $1 \le j \le m$ forms a basis of $\operatorname{Hom}(V,W)$.

Because of linearity, any linear transformation $L \in \operatorname{Hom}(V,W)$ is completely
defined given the knowledge of $L(e_j)$ for all $1 \le j \le m$. Since $\mathcal{B}'$ is a basis of $W$,
there exist $mn$ constants $a^i_j$ in $K$, indexed by $i = 1,2,\ldots,n$ and $j = 1,2,\ldots,m$,
such that

$$L(e_j) = \sum_{i=1}^n a^i_j f_i. \qquad (4.3)$$

Then

$$L = \sum_{i=1}^n \sum_{j=1}^m a^i_j T_{ij},$$

and hence, $\{T_{ij}\}$ spans $\operatorname{Hom}(V,W)$. Furthermore, suppose that for some constants
$c_{ij}$

$$\sum_{i=1}^n \sum_{j=1}^m c_{ij} T_{ij} = 0,$$

the trivial linear transformation. Then for all $1 \le k \le m$,

$$\sum_{i=1}^n \sum_{j=1}^m c_{ij} T_{ij}(e_k) = \sum_{i=1}^n c_{ik} f_i = 0.$$

However, since $\{f_1, f_2, \ldots, f_n\}$ is a linearly independent set, given any $k$, we have
$c_{ik} = 0$ for all $1 \le i \le n$. Hence, for all $i$ and $j$, the constants $c_{ij} = 0$, which shows
that the linear transformations $T_{ij}$ are linearly independent.

We conclude that the set $\{T_{ij} \mid 1 \le i \le n \text{ and } 1 \le j \le m\}$ forms a basis of
$\operatorname{Hom}(V,W)$. Consequently, $\dim \operatorname{Hom}(V,W) = mn$.


The proof of Proposition 4.1.2 provides the set of linear transformations $\{T_{ij}\}$ as
a standard basis of $\operatorname{Hom}(V,W)$. Furthermore, with respect to these bases,
$[T_{ij}]^{\mathcal{B}}_{\mathcal{B}'} = E_{ij}$, the $n \times m$ matrices where the entries are all 0 except for a 1 in the $(i,j)$th
entry.

Recall that the matrix $(a^i_j)$ described in (4.3), where $i$ is the row index and $j$ is
the column index, is called the matrix representing $L$ with respect to $\mathcal{B}$ and $\mathcal{B}'$. Using
the notation from this chapter’s introduction, we denote this by $[L]^{\mathcal{B}}_{\mathcal{B}'} = (a^i_j)$. A
standard result from linear algebra is that

$$[L]^{\mathcal{B}}_{\mathcal{B}'} = \big( [L(e_1)]_{\mathcal{B}'} \ [L(e_2)]_{\mathcal{B}'} \ \cdots \ [L(e_m)]_{\mathcal{B}'} \big).$$


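Clearly, the matrix that represents a linear transformation with respect to certain
bases will change if the ordered bases change. Let $T : V \to W$ be a linear
transformation. Suppose that $\dim V = m$ and $\dim W = n$. Let $\mathcal{A}$ and $\mathcal{A}'$ be two
bases of $V$, and let $P = P^{\mathcal{A}}_{\mathcal{A}'}$ be the change of coordinate matrix from $\mathcal{A}$ to $\mathcal{A}'$.
Let $\mathcal{B}$ and $\mathcal{B}'$ be two bases of $W$ and let $Q = Q^{\mathcal{B}}_{\mathcal{B}'}$ be the corresponding change of
coordinate matrix from $\mathcal{B}$ to $\mathcal{B}'$. We point out that the process of taking, say, the $\mathcal{A}$
coordinates of $V$ is a linear transformation $[\ ]_{\mathcal{A}} : V \to \mathbb{R}^m$. With this in mind, the
following diagram depicts how the linear transformations, coordinates, and matrix
multiplications are all related.

$$\begin{array}{ccccc}
 & \mathbb{R}^m & \xrightarrow{\ [T]^{\mathcal{A}}_{\mathcal{B}}\ } & \mathbb{R}^n & \\
 & \big\uparrow {\scriptstyle [\ ]_{\mathcal{A}}} & & \big\uparrow {\scriptstyle [\ ]_{\mathcal{B}}} & \\
P^{\mathcal{A}}_{\mathcal{A}'} \Big\downarrow & V & \xrightarrow{\ \ T\ \ } & W & \Big\downarrow Q^{\mathcal{B}}_{\mathcal{B}'} \\
 & \big\downarrow {\scriptstyle [\ ]_{\mathcal{A}'}} & & \big\downarrow {\scriptstyle [\ ]_{\mathcal{B}'}} & \\
 & \mathbb{R}^m & \xrightarrow{\ [T]^{\mathcal{A}'}_{\mathcal{B}'}\ } & \mathbb{R}^n &
\end{array} \qquad (4.4)$$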

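In particular, we read off the matrix relationship

$$[T]^{\mathcal{A}'}_{\mathcal{B}'} = Q^{\mathcal{B}}_{\mathcal{B}'}\, [T]^{\mathcal{A}}_{\mathcal{B}}\, P^{\mathcal{A}'}_{\mathcal{A}} = Q^{\mathcal{B}}_{\mathcal{B}'}\, [T]^{\mathcal{A}}_{\mathcal{B}}\, \big(P^{\mathcal{A}}_{\mathcal{A}'}\big)^{-1}. \qquad (4.5)$$

To write this relationship as a sum similar to (4.2), we suppose that $P^{-1} = (\check p^i_j)$,
$Q = (q^i_j)$, $[T]^{\mathcal{A}}_{\mathcal{B}} = (a^i_j)$ and $[T]^{\mathcal{A}'}_{\mathcal{B}'} = (\bar a^i_j)$. Then

$$\bar a^k_l = \sum_{i=1}^n \sum_{j=1}^m q^k_i\, \check p^j_l\, a^i_j. \qquad (4.6)$$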


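In the particular case of a linear transformation $T : V \to V$, we always assume
that the same basis change occurs simultaneously on both the domain and
codomain. Hence, $P = Q$. The relationship in (4.5) becomes

$$[T]^{\mathcal{A}'}_{\mathcal{A}'} = P\, [T]^{\mathcal{A}}_{\mathcal{A}}\, P^{-1},$$

making $[T]^{\mathcal{A}}_{\mathcal{A}}$ and $[T]^{\mathcal{A}'}_{\mathcal{A}'}$ similar matrices, and (4.6) changes to

$$\bar a^k_l = \sum_{i=1}^n \sum_{j=1}^n p^k_i\, \check p^j_l\, a^i_j. \qquad (4.7)$$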
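
The relationship (4.5) can be verified on a concrete example with exact rational arithmetic. The matrices below are arbitrary invertible choices made for illustration, and the helpers `matmul` and `inverse` are written here for the sketch rather than taken from any library.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse(M):
    """Gauss-Jordan inverse over the rationals; raises ValueError if singular."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

T_AB = [[1, 2, 0],
        [0, 1, 3]]                          # [T]^A_B, an n x m = 2 x 3 matrix
P = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]       # assumed change of coordinates A -> A'
Q = [[2, 1], [1, 1]]                        # assumed change of coordinates B -> B'

T_new = matmul(matmul(Q, T_AB), inverse(P))  # (4.5): Q [T]^A_B P^{-1}

v_A = [[1], [2], [3]]                        # A-coordinates of a test vector v
lhs = matmul(T_new, matmul(P, v_A))          # [T]^{A'}_{B'} [v]_{A'}
rhs = matmul(Q, matmul(T_AB, v_A))           # Q [T(v)]_B = [T(v)]_{B'}
```

Both routes around diagram (4.4) give the same $\mathcal{B}'$-coordinates of $T(v)$, which is the content of (4.5).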


Though a particular case of a Hom-space, the dual to a vector space plays a 
critical role in multilinear algebra. We devote the remainder of this section to the 
dual space. 


Definition 4.1.3. Let $V$ be a vector space over $K$. The vector space $\operatorname{Hom}_K(V,K)$
is called the dual space to $V$ and is denoted by $V^*$. Elements of $V^*$ are called
covectors to $V$.

By Proposition 4.1.2, if $V$ is finite-dimensional, then $\dim V^* = \dim V$ and, by
well-known facts from linear algebra, $V$ and $V^*$ are isomorphic. Knowing that two
vector spaces are isomorphic may suggest that there is not much difference between
them. However, the difference in how coordinates of covectors change versus how
coordinates of vectors change under a basis change is of foundational importance
with implications reaching into many areas of mathematics.


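Proposition 4.1.4. If $\mathcal{B} = (e_1, e_2, \ldots, e_n)$ is an ordered basis for $V$, then the linear
functions $e^{*i} : V \to K$, with $1 \le i \le n$, such that

$$e^{*i}(v) = v^i, \quad \text{whenever } [v]_{\mathcal{B}} = \begin{pmatrix} v^1 \\ \vdots \\ v^n \end{pmatrix}, \qquad (4.8)$$

form a basis of $V^*$. In particular, $\dim V = \dim V^*$.

Proof. This follows from the basis of $\operatorname{Hom}_K(V,W)$ exhibited in Proposition 4.1.2.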


We can give an alternate characterization of the functions $e^{*i}$. They are linear
and satisfy the property that

$$e^{*i}(e_j) = \delta^i_j = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}$$

This $\delta$ symbol is called the Kronecker delta. The Kronecker delta appears repeatedly
in multilinear algebra and represents the components of the identity matrix.

We point out that the map $\psi : V \to V^*$ that sends $e_i$ to $e^{*i}$ for all $1 \le i \le n$
(and that is completed by linearity) provides an explicit isomorphism between $V$
and $V^*$. This isomorphism depends on the choice of ordered basis.


Definition 4.1.5. The ordered basis $\mathcal{B}^* = (e^{*1}, \ldots, e^{*n})$ is called the dual basis or
cobasis to $\mathcal{B} = (e_1, e_2, \ldots, e_n)$.


It is important to remember that each $e^{*i}$ depends on the whole basis $\mathcal{B}$. There
is no canonical (without reference to a basis of $V$) way to define a function $v^* \in
\operatorname{Hom}(V,K)$ in reference to a single vector $v \in V$.


Proposition 4.1.6. Let $V$ be a vector space with two bases $\mathcal{A}$ and $\mathcal{B}$ and suppose
that $\lambda \in V^*$ has coordinates $(\lambda_i)$ with respect to the dual basis $\mathcal{A}^*$ and has
coordinates $(\bar\lambda_i)$ with respect to $\mathcal{B}^*$. If $Q = Q^{\mathcal{A}}_{\mathcal{B}}$ then

$$\bar\lambda_j = \sum_{i=1}^n \check q^i_j \lambda_i,$$

where $\check q^i_j$ are the components of the inverse matrix $Q^{-1}$.

Proof. This is a particular case of (4.5) and (4.6).
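
One way to see why covector coordinates must change by $Q^{-1}$ is that the number $\lambda(v)$ cannot depend on the basis. The sketch below checks this on an assumed $2 \times 2$ example; the matrix $Q$, the sample vector and covector, and the function names are illustrative choices.

```python
Q = [[2, 1],
     [1, 1]]            # assumed change of coordinate matrix (det Q = 1)
Q_inv = [[1, -1],
         [-1, 2]]       # its inverse, written out by hand

def vector_coords(Q, v):
    """Vector coordinates change by Q: vbar^i = sum_j q^i_j v^j."""
    n = len(v)
    return [sum(Q[i][j] * v[j] for j in range(n)) for i in range(n)]

def covector_coords(Q_inv, lam):
    """Covector coordinates change by the inverse:
    lambdabar_j = sum_i qcheck^i_j lambda_i (a row vector times Q^{-1})."""
    n = len(lam)
    return [sum(Q_inv[i][j] * lam[i] for i in range(n)) for j in range(n)]

def pairing(lam, v):
    """lambda(v) = sum_i lambda_i v^i in any pair of dual bases."""
    return sum(l * x for l, x in zip(lam, v))

v, lam = [3, -2], [4, 5]
v_bar = vector_coords(Q, v)
lam_bar = covector_coords(Q_inv, lam)

value_old = pairing(lam, v)          # computed in the bases A and A*
value_new = pairing(lam_bar, v_bar)  # computed in the bases B and B*
```

The two values agree, since the factors $Q$ and $Q^{-1}$ cancel in $\bar\lambda_j \bar v^j$.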


In the context of a dual space, it is possible to give a natural interpretation
of the transpose of a matrix. Suppose that $V$ and $W$ are two vector spaces over
$K$ and that $T \in \operatorname{Hom}_K(V,W)$. There is a natural way to define an associated
linear transformation $W^* \to V^*$ as follows. Given a linear function $g \in W^*$, the
composition $v \mapsto g(T(v))$ is an element of $V^*$. Therefore, we call $T^* : W^* \to V^*$
the transformation such that $T^*(g)$ is the unique element of $V^*$ that satisfies

$$T^*(g)(v) = g(T(v)). \qquad (4.9)$$

As the composition of linear transformations, $T^*$ is again linear, and hence, $T^* \in
\operatorname{Hom}(W^*,V^*)$. This transformation $T^*$ is called the dual of $T$.


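Proposition 4.1.7. Let $V$ and $W$ be finite-dimensional $K$-vector spaces with ordered
bases $\mathcal{A}$ and $\mathcal{B}$, respectively. Let $T : V \to W$ be a linear transformation. The
matrix representing the dual $T^* : W^* \to V^*$ with respect to the cobases $\mathcal{B}^*$ and $\mathcal{A}^*$
is the transpose of the matrix representing $T$ with respect to $\mathcal{A}$ and $\mathcal{B}$. In other
words,

$$[T^*]^{\mathcal{B}^*}_{\mathcal{A}^*} = \big( [T]^{\mathcal{A}}_{\mathcal{B}} \big)^{\top}.$$


Proof. Suppose that $\mathcal{A} = (e_1, e_2, \ldots, e_m)$ and $\mathcal{B} = (f_1, f_2, \ldots, f_n)$, respectively, and
let $A = (a^i_j)$ be the matrix representing $T$ with respect to these bases, so that

$$T(e_i) = \sum_{k=1}^n a^k_i f_k.$$

For all $v \in V$, we can write $v$ as $v = v^1 e_1 + v^2 e_2 + \cdots + v^m e_m$. Then

$$\begin{aligned}
T^*(f^{*j})(v) &= f^{*j}(T(v)) = f^{*j}\left( \sum_{i=1}^m v^i T(e_i) \right) = f^{*j}\left( \sum_{i=1}^m v^i \sum_{k=1}^n a^k_i f_k \right) \\
&= \sum_{i=1}^m \sum_{k=1}^n v^i a^k_i f^{*j}(f_k) = \sum_{i=1}^m \sum_{k=1}^n v^i a^k_i \delta^j_k \\
&= \sum_{i=1}^m v^i a^j_i = \sum_{i=1}^m a^j_i e^{*i}(v).
\end{aligned}$$

Thus, as covectors,

$$T^*(f^{*j}) = \sum_{i=1}^m a^j_i e^{*i}.$$

Hence, the $(i,j)$-entry of the matrix representing $T^*$ with respect to $\mathcal{B}^*$ and $\mathcal{A}^*$ is
$a^j_i$, which is the same as the matrix representing $T$, but with the role of rows and
columns reversed. The result follows.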
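
The statement can be checked directly from definition (4.9) on a small example. In the sketch below, the matrix `A`, the covector `g`, and the function names are assumptions for illustration; covectors are stored by their coordinates in the relevant cobasis.

```python
def transpose(A):
    """Swap the roles of rows and columns."""
    return [list(col) for col in zip(*A)]

def apply(A, x):
    """Matrix-vector product with A given as a list of rows."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]

def pullback(A, g):
    """Coordinates of T*(g) = g o T in the cobasis A*, computed from (4.9):
    the i-th coordinate is T*(g)(e_i) = g(T(e_i))."""
    m = len(A[0])
    coords = []
    for i in range(m):
        e_i = [int(j == i) for j in range(m)]        # i-th basis vector of V
        coords.append(sum(gk * wk for gk, wk in zip(g, apply(A, e_i))))
    return coords

A = [[1, 2, 0],
     [0, 1, 3]]          # assumed matrix of T : V -> W with dim V = 3, dim W = 2
g = [5, -1]              # assumed coordinates of a covector g in the cobasis B*

via_definition = pullback(A, g)            # uses only T*(g)(v) = g(T(v))
via_transpose = apply(transpose(A), g)     # uses the transposed matrix
```

Both computations give the same coordinates, as Proposition 4.1.7 predicts.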


In the previous paragraphs, we emphasized the role of a basis in establishing an
isomorphism between $V$ and $V^*$. Operations on a vector space that can be described
without reference to any particular basis are called canonical. For example, the
definition of the dual of a vector space and the definition of a dual of a linear
transformation in (4.9) are canonical definitions. On the other hand, when a vector
space $V$ has an ordered basis $\mathcal{B}$, the isomorphism $\psi : V \to V^*$ defined in (4.8) is
not canonical.

We now consider the double-dual of $V$, namely the dual of $V^*$. Given any vector
$v \in V$, we define the co-covector $\lambda_v \in V^{**} = \operatorname{Hom}(V^*, K)$ by

$$\lambda_v(f) = f(v). \qquad (4.10)$$

This defines a function $\Lambda : V \to V^{**}$ by $\Lambda(v) = \lambda_v$.

Proposition 4.1.8. The function $\Lambda$ defined by (4.10) is an injective linear transformation.
Furthermore, if $V$ is finite-dimensional, then $\Lambda$ is a canonical isomorphism
between a vector space $V$ and its double dual $V^{**}$.


Proof. We defined $\Lambda$ without reference to any basis so it is canonical.
We first prove that $\Lambda$ is a linear transformation. Let $v, w \in V$, and let $c \in K$.
For all $f \in V^*$,

$$\begin{aligned}
\Lambda(v + w)(f) &= \lambda_{v+w}(f) = f(v + w) = f(v) + f(w) = \lambda_v(f) + \lambda_w(f) \\
&= \Lambda(v)(f) + \Lambda(w)(f), \qquad (4.11)
\end{aligned}$$

so as co-covectors, $\Lambda(v + w) = \Lambda(v) + \Lambda(w)$. Similarly,

$$\Lambda(cv)(f) = \lambda_{cv}(f) = f(cv) = c f(v) = c \lambda_v(f) = c \Lambda(v)(f), \qquad (4.12)$$

so again, as co-covectors, $\Lambda(cv) = c\Lambda(v)$.

Next, we show that $\Lambda$ is injective. Let $u_1, u_2 \in V$ be vectors, and suppose
that $\Lambda(u_1) = \Lambda(u_2)$. Thus, $f(u_1) = f(u_2)$ for all $f \in \operatorname{Hom}(V,K)$. Therefore,
$f(u_1 - u_2) = 0$ for all $f \in V^*$, hence $u_1 - u_2 = 0$, and thus, $u_1 = u_2$, proving that
$\Lambda$ is injective.

Finally, we prove that if $V$ is finite-dimensional, then $\Lambda$ is an isomorphism. Since
$\dim W = \dim W^*$ for all finite-dimensional vector spaces, we also have $\dim V =
\dim V^{**}$. Since $\Lambda$ is injective, by the Rank-Nullity Theorem, $\dim(\operatorname{Im} \Lambda) = \operatorname{rank} \Lambda =
\dim V = \dim V^{**}$. Thus $\Lambda$ is surjective. Hence, $\Lambda$ is an isomorphism.


We point out that requiring $V$ to be finite-dimensional for $V$ and $V^{**}$ to be
canonically isomorphic is not a limitation of the above proof. There exist
infinite-dimensional vector spaces $V$ with a basis $\mathcal{B}$ such that a basis of $V^*$ has a strictly
greater cardinality than $|\mathcal{B}|$. This suffices to show that $V$ and $V^*$ and by extension
$V^{**}$ cannot be isomorphic.

However, the benefit of the existence of a canonical isomorphism between $V$ and
$V^{**}$ when $V$ is finite-dimensional arises especially in regard to how coordinates
change under a change of basis. Proposition 4.1.6 shows that coordinates of covectors
change by the inverse of the coordinate change matrix under a basis change on
$V$. However, since there is a canonical isomorphism between $V$ and $V^{**}$, coordinates
of co-covectors change by the regular coordinate change matrix and hence behave
as regular vectors under a basis change on $V$.

As the reader has hopefully noticed, it is standard to use superscripts for the
index of coordinates of a vector with respect to a basis $\mathcal{B}$ and to use subscripts
for the index of coordinates of a covector with respect to $\mathcal{B}^*$. Superscript indices
are called contravariant indices and subscript indices are called covariant indices.
Proposition 4.1.8 shows that we do not need three or, worse, countably many
types of index. This distinction between types of indices dovetails with our
habits of matrix notation: we write coordinates of a vector as a column matrix and
coordinates of a covector as a row matrix.

We can now introduce the Einstein summation convention, which shortens cal- 
culations involving components of objects in multilinear algebra. In any expression 
involving the product of components of vectors or matrices (or eventually tensors), 
we will assume that we sum over any index that is repeated in the superscript and 
in the subscript. For example, if $1 \le i \le n$, then

$$a^i_j v^j \quad\text{means}\quad \sum_{j=1}^{n} a^i_j v^j.$$


This expression gives the components of the matrix-vector product $Av$. Also, if
$(c^i)$ are real numbers and $(e_i)$ is a list of vectors, then under the Einstein summation
convention

$$c^i e_i \quad\text{means the linear combination}\quad c^1 e_1 + c^2 e_2 + \cdots + c^n e_n.$$


As a third example, we could rewrite (4.7) using the Einstein summation convention as

$$\bar{a}^k_l = p^k_i \, \check{p}^j_l \, a^i_j.$$

Except for summations involving $T_{ij}$ in the proof of Proposition 4.1.2, we
properly used superscript and subscript indices in every equation in this section so that
the summation symbol could be removed and the expression would remain correct under the
Einstein summation convention. Through the remainder of the book, we will use
the Einstein summation convention, and occasionally write (ESC) to remind the
reader of this convention.
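
The convention can be made concrete with a short sketch. The Python below is only an illustration, not part of the text's development: the function names are hypothetical, and indices are 0-based in code while the text uses 1-based indices. Each function writes out the sum that a repeated upper and lower index implies.

```python
def contract_matrix_vector(a, v):
    """Return w with w[i] = a[i][j] * v[j], summed over the repeated index j."""
    n = len(v)
    return [sum(a[i][j] * v[j] for j in range(n)) for i in range(len(a))]


def linear_combination(c, e):
    """Return c^i e_i: the sum over i of the scalar c[i] times the vector e[i]."""
    dim = len(e[0])
    return [sum(c[i] * e[i][k] for i in range(len(c))) for k in range(dim)]


# a^i_j v^j for a 2 x 2 matrix and a vector in R^2
a = [[1, 2], [3, 4]]
v = [5, 6]
w = contract_matrix_vector(a, v)

# c^i e_i for two scalars and two vectors in R^2
c = [2, -1]
e = [[1, 0], [1, 1]]
combo = linear_combination(c, e)
```

Here `w` holds the components of the matrix-vector product and `combo` holds the components of the linear combination.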


We conclude this section with a brief comment about the direct sum of two
vector spaces. Though not immediately connected to the Hom-space or dual space,
we mention this construction here since some exercises involve the direct sum.

Definition 4.1.9. Let V and W be two vector spaces over a field K. The direct
sum $V \oplus W$ consists of the set $V \times W$ equipped with operations of addition and
scalar multiplication defined as:

1. $(v_1, w_1) + (v_2, w_2) \overset{\text{def}}{=} (v_1 + v_2, w_1 + w_2)$.
2. $c \cdot (v, w) \overset{\text{def}}{=} (cv, cw)$ for all $c \in K$.


It is a simple exercise to show that $V \oplus W$ is a vector space over K. In the
direct sum $V \oplus W$, the subset $\{(v, 0) \mid v \in V\}$ is a subspace isomorphic to V and the
subset $\{(0, w) \mid w \in W\}$ is a subspace isomorphic to W. By an abuse of terminology,
we will often say that V and W are subspaces of $V \oplus W$, with V and W identified
according to these natural isomorphisms.


PROBLEMS 


4.1.1. Let V be a vector space with basis $\{e_1, e_2, \ldots, e_n\}$. Prove that the set of
functions $\{e^{*1}, e^{*2}, \ldots, e^{*n}\}$ defined in (4.8) forms a basis of $V^*$.


4.1.2. Prove that $\dim(V \oplus W) = \dim V + \dim W$.

4.1.3. Let U, V, and W be vector spaces over a field K. Prove that there exist canonical
(vector space) isomorphisms

$$\mathrm{Hom}(U \oplus V, W) \cong \mathrm{Hom}(U, W) \oplus \mathrm{Hom}(V, W), \tag{4.13}$$
$$\mathrm{Hom}(U, V \oplus W) \cong \mathrm{Hom}(U, V) \oplus \mathrm{Hom}(U, W). \tag{4.14}$$

4.1.4. Let $V_1$, $V_2$, $W_1$, and $W_2$ be vector spaces over a field K. Suppose that $L : V_1 \to V_2$
and $T : W_1 \to W_2$ are linear transformations. Define the function

$$f : V_1 \oplus W_1 \to V_2 \oplus W_2, \qquad (v, w) \mapsto (L(v), T(w)).$$

Suppose that $\mathcal{A}_1$, $\mathcal{A}_2$, $\mathcal{B}_1$, and $\mathcal{B}_2$ are ordered bases on $V_1$, $V_2$, $W_1$, and $W_2$,
respectively. Prove that the matrix of f with respect to the basis $\mathcal{A}_1 \cup \mathcal{B}_1$ on
$V_1 \oplus W_1$ and to the basis $\mathcal{A}_2 \cup \mathcal{B}_2$ on $V_2 \oplus W_2$ is block diagonal.
4.1.5. Let V and W be finite-dimensional vector spaces over a field K with dimensions
m and n, respectively, and let $f : V \to W$ be a linear transformation. Define the
linear transformation

$$T : V \oplus W \to V \oplus W, \qquad (v, w) \mapsto (v, f(v) + w).$$

(a) Prove that the only eigenvalue of T is 1 (with multiplicity $m + n$).

(b) Prove that the eigenspace of 1 is $E_1 = \operatorname{Ker} f \oplus W$, and conclude that the
geometric multiplicity of 1 is $m + n - \operatorname{rank} f$.


4.1.6. Let $T : V \to V$ be a linear transformation and suppose that, with respect
to some basis $\mathcal{B}$ of V, it has the matrix $A = (a^i_j)$. Using the Einstein summation
convention, prove that $a^i_i$, which is the trace $\operatorname{Tr}(A)$ of A, is independent of the
basis.

4.1.7. Let V be a vector space with an ordered basis $\mathcal{B}$. Let $v \in V$ and let $f \in V^*$.
Suppose that the coordinates of v with respect to $\mathcal{B}$ are $(v^i)$ and that the coordinates
of f with respect to $\mathcal{B}^*$ are $(f_i)$. Prove that $f(v)$ is equal to $f_i v^i$ (ESC) and show
that this quantity is independent of the basis.

4.1.8. Let U, V, and W be vector spaces. Prove that
$\psi : \mathrm{Hom}(U, \mathrm{Hom}(V, W)) \to \mathrm{Hom}(V, \mathrm{Hom}(U, W))$ defined by $\psi(T) = L$, where for all $u \in U$ and for all $v \in V$
the linear transformation L satisfies $L(v)(u) = T(u)(v)$, is a canonical isomorphism.

4.1.9. Let $V = C^0([a, b], \mathbb{R})$ be the vector space of continuous real-valued functions
defined on the interval $[a, b]$. For all $f \in C^0([a, b], \mathbb{R})$, define $A_f$ as the covector
satisfying

$$A_f(g) = \int_a^b f(x) g(x) \, dx \quad \text{for all } g \in C^0([a, b], \mathbb{R}).$$

(a) Prove that $A : V \to V^*$ defined by $A(f) = A_f$ is an injective linear transformation.

(b) Let $c \in [a, b]$ and define the evaluation at c as $\mathrm{ev}_c(g) = g(c)$. Prove that
$\mathrm{ev}_c \in V^*$.

(c) Prove that there does not exist $f \in C^0([a, b], \mathbb{R})$ such that $A_f = \mathrm{ev}_c$. [This
shows that A is injective but not surjective.]


4.1.10. Let V be a vector space, and let W be a subspace. Define the relation $\sim$ on
vectors of V by

$$v_1 \sim v_2 \iff v_1 - v_2 \in W.$$

(a) Prove that $\sim$ is an equivalence relation.

(b) Denote by V/W the set of equivalence classes. Prove that V/W has the
structure of a vector space under the operations $[v_1] + [v_2] \overset{\text{def}}{=} [v_1 + v_2]$ and
$c \cdot [v] \overset{\text{def}}{=} [cv]$.

(c) Suppose that V is finite-dimensional. Prove that $\dim V/W = \dim V - \dim W$.

(The vector space V/W is called the quotient vector space of V with respect to W.)


4.2 Bilinear Forms and Inner Products 


4.2.1 Bilinear Forms on V x W 


Definition 4.2.1. Let V and W be vector spaces over a field K. A bilinear form
$(\cdot,\cdot)$ on $V \times W$ is a function $V \times W \to K$ such that for all $v, v_1, v_2 \in V$, all $w, w_1, w_2 \in W$, and
all $\lambda \in K$,

$$(v_1 + v_2, w) = (v_1, w) + (v_2, w), \qquad (\lambda v, w) = \lambda (v, w), \tag{4.15}$$
$$(v, w_1 + w_2) = (v, w_1) + (v, w_2), \qquad (v, \lambda w) = \lambda (v, w). \tag{4.16}$$

If V = W, then we say $(\cdot,\cdot)$ is a bilinear form on V.

We can restate this definition to say that for any fixed $w_0 \in W$, the function
$x \mapsto (x, w_0)$ is in $V^*$ (corresponding to (4.15)) and that for any fixed $v_0 \in V$, the
function $x \mapsto (v_0, x)$ is in $W^*$ (corresponding to (4.16)).

The notation used for a bilinear form varies widely in the literature because
of the many areas in which it is used. In terms of function notation, we might
encounter $f : V \times W \to K$ or perhaps $\omega : V \times W \to K$ for a
bilinear form, and $\langle \cdot, \cdot \rangle$ or $(\cdot,\cdot)$ for the “product” notation. If V = W, we sometimes
write the pair (V, f) to denote the vector space V equipped with the bilinear form f.


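Example 4.2.2. In elementary linear algebra, the most commonly known example
of a bilinear form on $\mathbb{R}^n$ is the dot product between two vectors, defined in terms of
standard coordinates by

$$\vec{v} \cdot \vec{w} = v^1 w^1 + v^2 w^2 + \cdots + v^n w^n.$$

The following functions $\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ are also bilinear forms:

$$(\vec{v}, \vec{w})_1 = v_1 w_2 + v_2 w_1 + v_3 w_3 + \cdots + v_n w_n,$$
$$(\vec{v}, \vec{w})_2 = 2 v_1 w_1 + v_2 w_2 + \cdots + v_n w_n + v_1 w_n, \tag{4.17}$$
$$(\vec{v}, \vec{w})_3 = v_1 w_2 - v_2 w_1.$$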


Despite the variety depicted in the above example, bilinear forms on finite di- 
mensional vector spaces can be completely characterized by a single matrix. 


Proposition 4.2.3. Let V and W be finite-dimensional vector spaces, with $\dim V = m$
and $\dim W = n$. Let $(\cdot,\cdot)$ be a bilinear form on $V \times W$. Given ordered bases $\mathcal{A}$
of V and $\mathcal{B}$ of W, there exists a unique $m \times n$ matrix C such that

$$(v, w) = [v]_{\mathcal{A}}^{\top} \, C \, [w]_{\mathcal{B}}.$$

Furthermore, if $\mathcal{A} = (e_1, e_2, \ldots, e_m)$ and $\mathcal{B} = (u_1, u_2, \ldots, u_n)$, then the entries of C
are $c_{ij} = (e_i, u_j)$ for $1 \le i \le m$ and $1 \le j \le n$.


Proof. Let $v \in V$ and $w \in W$ be vectors with coordinates

$$[v]_{\mathcal{A}} = \begin{pmatrix} v^1 \\ \vdots \\ v^m \end{pmatrix} \quad\text{and}\quad [w]_{\mathcal{B}} = \begin{pmatrix} w^1 \\ \vdots \\ w^n \end{pmatrix}.$$

Then since $(\cdot,\cdot)$ is bilinear,

$$(v, w) = (v^i e_i, w^j u_j) = v^i w^j (e_i, u_j). \quad \text{(ESC)} \tag{4.18}$$

Setting $c_{ij} = (e_i, u_j)$ and the matrix $C = (c_{ij})$, the quantities $(e_i, u_j) w^j$ (ESC), for $1 \le i \le m$,
are the coordinates of $C[w]_{\mathcal{B}}$. Then (4.18) shows that $(v, w) = [v]_{\mathcal{A}}^{\top} C [w]_{\mathcal{B}}$, where
$[v]^{\top}$ means the coordinates of v are written in a row vector as opposed to a column
vector.

Definition 4.2.4. The mn constants $(c_{ij})$ described in Proposition 4.2.3 are called
the components of $(\cdot,\cdot)$ with respect to the ordered bases $\mathcal{A}$ and $\mathcal{B}$.

We emphasize that it is appropriate to use two subscript indices for $(c_{ij})$ in
light of the comments at the end of the previous section about contravariant and
covariant indices. Using the Einstein summation convention, we would write the
evaluation of $(v, w)$ as

$$(v, w) = c_{ij} v^i w^j.$$

This hints that both indices are covariant. This is the content of Proposition 4.2.6 below.


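Example 4.2.5. Let $V = \mathbb{R}^n$, use the standard basis, and consider the bilinear
forms in Example 4.2.2. First, note that the matrix for the dot product is just the
identity matrix:

$$\vec{v} \cdot \vec{w} = \vec{v}^{\top} I_n \, \vec{w}.$$

For the other forms, it is easy to see that

$$(\vec{v}, \vec{w})_1 = \vec{v}^{\top} \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix} \vec{w},$$

$$(\vec{v}, \vec{w})_2 = \vec{v}^{\top} \begin{pmatrix} 2 & 0 & \cdots & 1 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \vec{w},$$

$$(\vec{v}, \vec{w})_3 = \vec{v}^{\top} \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ -1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix} \vec{w}.$$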
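
The recipe $c_{ij} = (e_i, u_j)$ of Proposition 4.2.3 is easy to carry out mechanically. The following Python sketch is illustrative only: the helper names are hypothetical, the dimension is fixed at n = 4 for concreteness, and indices are 0-based in code. It builds the component matrices of the second and third forms of Example 4.2.2 with respect to the standard basis and checks that evaluating $c_{ij} v^i w^j$ reproduces the form.

```python
def standard_basis(n):
    """Return the standard basis of R^n as a list of coordinate lists."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def component_matrix(form, basis):
    """Return C with C[i][j] = form(e_i, e_j) for the given ordered basis."""
    return [[form(e, u) for u in basis] for e in basis]


def evaluate(c, v, w):
    """Evaluate c_ij v^i w^j, i.e. [v]^T C [w]."""
    return sum(c[i][j] * v[i] * w[j]
               for i in range(len(v)) for j in range(len(w)))


def form_2(v, w):
    """2 v_1 w_1 + v_2 w_2 + ... + v_n w_n + v_1 w_n."""
    return 2 * v[0] * w[0] + sum(v[i] * w[i] for i in range(1, len(v))) + v[0] * w[-1]


def form_3(v, w):
    """v_1 w_2 - v_2 w_1."""
    return v[0] * w[1] - v[1] * w[0]


basis = standard_basis(4)
C2 = component_matrix(form_2, basis)
C3 = component_matrix(form_3, basis)

v, w = [1, 2, 3, 4], [5, 6, 7, 8]
agree_2 = evaluate(C2, v, w) == form_2(v, w)
agree_3 = evaluate(C3, v, w) == form_3(v, w)
```

With respect to the standard basis the coordinates of a vector are its entries, which is why `v` and `w` can be passed directly to both `form_2` and `evaluate`.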


Proposition 4.2.6. Let V and W be finite-dimensional vector spaces. Suppose
that $\mathcal{A}$ and $\mathcal{A}'$ are ordered bases on V and that $\mathcal{B}$ and $\mathcal{B}'$ are ordered bases on W.
Let P be the coordinate change matrix on V from $\mathcal{A}$ to $\mathcal{A}'$ and let Q
be the coordinate change matrix on W from $\mathcal{B}$ to $\mathcal{B}'$. Let $(\cdot,\cdot)$ be a bilinear form
on $V \times W$ with components $(c_{ij})$ with respect to $\mathcal{A}$ and $\mathcal{B}$ and with components $(\bar{c}_{kl})$
with respect to $\mathcal{A}'$ and $\mathcal{B}'$. Then

$$\bar{c}_{kl} = \check{p}^i_k \, \check{q}^j_l \, c_{ij},$$

where $(\check{p}^i_k)$ are the components of $P^{-1}$ and $(\check{q}^j_l)$ are the components of $Q^{-1}$.

Proof. (The proof is left as an exercise for the reader.)


Definition 4.2.7. Let V and W be vector spaces over K, and let $(\cdot,\cdot)$ be a bilinear
form on $V \times W$. Then $(\cdot,\cdot)$ is called

1. nondegenerate on the left if for all nonzero $v \in V$, there exists $w \in W$ such
that $(v, w) \neq 0$;

2. nondegenerate on the right if for all nonzero $w \in W$, there exists $v \in V$ such
that $(v, w) \neq 0$;

3. nondegenerate if it is nondegenerate on the right and on the left.

Furthermore, the rank of $(\cdot,\cdot)$ is the rank of its associated matrix with respect to
any bases on V and W.


Basic facts about the rank of a matrix imply that if a form is nondegenerate on
the left, then the number of rows of its associated matrix C is equal to the rank
of the form. If a form is nondegenerate on the right, then the number of columns
of C is equal to the rank of the form. Hence, a form can only be nondegenerate if
$\dim V = \dim W$.


4.2.2 Bilinear Forms on V 


Many applications of bilinear forms involve a bilinear form $(\cdot,\cdot)$ on V.

When we consider the components of a bilinear form on V with respect to bases,
we always assume that $\mathcal{A} = \mathcal{B}$. The components $(c_{ij})$ described in Proposition 4.2.3
can then be written as an $n \times n$ matrix. In Proposition 4.2.6, we also suppose that
$\mathcal{A}' = \mathcal{B}'$ so that the change of coordinate matrices are equal, P = Q. Then

$$\bar{c}_{kl} = \check{p}^i_k \, \check{p}^j_l \, c_{ij}.$$

The matrices of components $(c_{ij})$ and $(\bar{c}_{kl})$ are not necessarily similar. If they
were, they would satisfy (4.7). Consequently, though we do depict the components
of a bilinear form according to Proposition 4.2.3, the matrix does not behave under
coordinate changes like a matrix that represents a linear transformation. We leave
it as an exercise for the reader to prove that

$$\bar{C} = (P^{-1})^{\top} C P^{-1}, \tag{4.19}$$

where C is the matrix $(c_{ij})$ and $\bar{C}$ is the matrix with components $(\bar{c}_{kl})$. In particular,
(4.19) reduces to a similarity relation when P is orthogonal, since then $(P^{-1})^{\top} = P$.
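
A small numerical check may help fix the transformation rule (4.19). The Python sketch below is an illustration under stated assumptions: the matrices C and P are arbitrary choices made here, the helper names are hypothetical, and new coordinates are taken to be $[v]' = P[v]$. It verifies that the value $[v]^{\top} C [w]$ is unchanged when both the coordinates and the component matrix are transformed.

```python
from fractions import Fraction


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Return the matrix product a b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    """Return the inverse of an invertible 2 x 2 matrix, using exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def apply(a, v):
    """Return the coordinate vector a v."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


def form(c, v, w):
    """Evaluate [v]^T C [w]."""
    return sum(c[i][j] * v[i] * w[j] for i in range(len(v)) for j in range(len(w)))


C = [[1, 2], [0, 3]]      # components of a bilinear form (illustrative choice)
P = [[2, 1], [1, 1]]      # coordinate change matrix (illustrative choice)
P_inv = inverse_2x2(P)

# Equation (4.19): the components in the new coordinates.
C_bar = matmul(matmul(transpose(P_inv), C), P_inv)

v, w = [1, 2], [3, -1]
old_value = form(C, v, w)
new_value = form(C_bar, apply(P, v), apply(P, w))
```

Both `old_value` and `new_value` compute the same scalar $(v, w)$, once in the old coordinates and once in the new ones.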


Definition 4.2.8. A bilinear form $(\cdot,\cdot)$ on a vector space V is called

1. symmetric if $(y, x) = (x, y)$ for all $x, y \in V$; and

2. antisymmetric (or skew-symmetric) if $(y, x) = -(x, y)$ for all $x, y \in V$.

From Proposition 4.2.3, we see that $(\cdot,\cdot)$ is symmetric if and only if its component
matrix C is symmetric, and is antisymmetric if and only if C is antisymmetric.

By way of example, referring to the three bilinear forms on $\mathbb{R}^n$ in Example 4.2.2,
$(\cdot,\cdot)_1$ is symmetric and nondegenerate; $(\cdot,\cdot)_2$ is nondegenerate but neither
symmetric nor antisymmetric; $(\cdot,\cdot)_3$ is antisymmetric and degenerate.

Proposition 4.2.10 below gives a key characterization of both symmetric and 
antisymmetric bilinear forms. Its proof repeatedly uses the notion of a perpendicular 
subspace. 


Definition 4.2.9. Let V be a vector space with a bilinear form $(\cdot,\cdot)$. If W is a
subspace of V, the set

$$W^{\perp} = \{ v \in V \mid (v, w) = 0 \text{ for all } w \in W \}$$

is called the $(\cdot,\cdot)$-orthogonal subspace to W.


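Proposition 4.2.10. Let $(\cdot,\cdot)$ be a bilinear form on a vector space V with $\dim V = n$.
Let $I_k$ denote the $k \times k$ identity matrix.

1. If $(\cdot,\cdot)$ is symmetric, there exists a basis $\mathcal{B}$ relative to which the component
matrix is

$$\begin{pmatrix} I_p & 0 & 0 \\ 0 & -I_q & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{4.20}$$

for some nonnegative integers p and q.

2. If $(\cdot,\cdot)$ is antisymmetric, there exists a basis $\mathcal{B}$ relative to which the component
matrix is

$$\begin{pmatrix} 0 & I_k & 0 \\ -I_k & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{4.21}$$

for some nonnegative integer k.
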
Proof. (1) It is easy to check that a symmetric form $(\cdot,\cdot)$ satisfies

$$(v, w) = \frac{1}{4} \big( (v + w, v + w) - (v - w, v - w) \big). \tag{4.22}$$

Consequently, whenever the restriction $(\cdot,\cdot)|_W$ to a subspace W is not trivial, there
is a $w \in W$ with $(w, w) \neq 0$.

Suppose that $(\cdot,\cdot)|_V \neq 0$. Then there exists $e_1' \in V$ with $(e_1', e_1') \neq 0$. Defining
$e_1 = e_1' / \sqrt{|(e_1', e_1')|}$, we have $\varepsilon_1 = (e_1, e_1) = \pm 1$. Let $V_1 = \mathrm{Span}(e_1)$ and $W_1 = V_1^{\perp}$.
Then $W_1$ is a subspace with $V_1 \cap W_1 = \{0\}$. Furthermore, for all $v \in V$, we have
$v - \varepsilon_1 (v, e_1) e_1 \in W_1$, so $V_1 + W_1 = V$. Hence, $V_1$ and $W_1$ are complementary
subspaces.

If $(\cdot,\cdot)|_{W_1}$ is not trivial, then there exists some $e_2 \in W_1$ with $\varepsilon_2 = (e_2, e_2) = \pm 1$.
We then define $V_2 = \mathrm{Span}(e_1, e_2)$ and $W_2 = V_2^{\perp}$. By the same reasoning as above, $V_2$
and $W_2$ are complementary subspaces. We repeat this process until $(\cdot,\cdot)$ restricted
to some $W_k$ is trivial.

Let $\mathcal{B}$ be an ordered basis consisting of $(e_1, e_2, \ldots, e_k)$ permuted so that all the
vectors with $\varepsilon_i > 0$ come first, followed by any basis of $W_k$. With respect to $\mathcal{B}$, the
form has the matrix described in (4.20).

(2) Since $(\cdot,\cdot)$ is antisymmetric, $(v, v) = 0$ for all $v \in V$. If $(\cdot,\cdot)$ is not trivial
on V, there exist two linearly independent vectors $e_1, u_1$ such that $(e_1, u_1) \neq 0$. By
rescaling one of them, we can assume that $(e_1, u_1) = 1$. Define $V_1 = \mathrm{Span}(e_1, u_1)$.
The matrix of $(\cdot,\cdot)|_{V_1}$ with respect to the ordered basis $(e_1, u_1)$ is

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$

Define $W_1 = V_1^{\perp}$. We note that $V_1 \cap W_1 = \{0\}$. Clearly, for all $v \in V$,

$$v - (v, u_1) e_1 + (v, e_1) u_1 \in W_1.$$

Thus $V_1 + W_1 = V$, so since $V_1 \cap W_1 = \{0\}$, $V_1$ and $W_1$ are complementary subspaces
in V.

As in part (1), if $(\cdot,\cdot)$ restricted to $W_1$ is not trivial, then we can repeat the
procedure on $W_1$ and construct $e_2$ and $u_2$ such that $(e_2, u_2) = 1$, and so forth.
We repeat this until $(\cdot,\cdot)$ restricted to $W_k$ is trivial. Then define $\mathcal{B}$ as the ordered
basis consisting first of $(e_1, \ldots, e_k, u_1, \ldots, u_k)$, followed by a basis of $W_k$. Then
the matrix of $(\cdot,\cdot)$ with respect to $\mathcal{B}$ is (4.21).


Three specific types of bilinear forms play important roles in this text: inner
products, symplectic forms, and Minkowski metrics. Each leads to a different kind
of geometry. We mention them here together to show their similar origins.

Definition 4.2.11. Let V be a vector space over $\mathbb{R}$. An inner product on V is a
bilinear form $(\cdot,\cdot)$ on V that is symmetric and positive-definite, i.e., $(v, v) > 0$ for
all $v \in V - \{0\}$. We call the pair $(V, (\cdot,\cdot))$ an inner product space.


Inner products are often introduced in elementary linear algebra courses. We
remind the reader that in any inner product space $(V, (\cdot,\cdot))$, we can generalize
geometric concepts that originally arise in connection with the dot product on $\mathbb{R}^n$.

• We define the magnitude of an element $v \in V$ as $\|v\| = \sqrt{(v, v)}$.

• The Cauchy-Schwarz inequality holds: $|(v, w)| \le \|v\| \, \|w\|$ for all $v, w \in V$.

• The angle between two nonzero vectors v and w is the number $\theta$ satisfying $0 \le \theta \le \pi$ and

$$\cos\theta = \frac{(v, w)}{\|v\| \, \|w\|}.$$

• Two elements v and w are orthogonal to each other if $(v, w) = 0$.

• We can perform Gram-Schmidt orthonormalization on V.

• If we define the function $d : V \times V \to \mathbb{R}^{\ge 0}$ by $d(x, y) = \|x - y\|$, then d satisfies
the triangle inequality and is a metric on V, making (V, d) into a metric space.
(See Section A.1.)
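
As an illustration of the Gram-Schmidt item in the list above, here is a minimal Python sketch for an inner product on $\mathbb{R}^n$ given by a component matrix. It rests on assumptions that are not taken from the text: the matrix `C` is an arbitrary symmetric positive-definite choice, the input vectors are assumed linearly independent, the function names are hypothetical, and floating-point arithmetic is used, so orthonormality holds only up to rounding.

```python
import math


def inner(c, v, w):
    """Evaluate the inner product c_ij v^i w^j."""
    return sum(c[i][j] * v[i] * w[j] for i in range(len(v)) for j in range(len(w)))


def gram_schmidt(c, vectors):
    """Orthonormalize linearly independent vectors with respect to the
    inner product whose component matrix is c."""
    basis = []
    for v in vectors:
        u = list(v)
        for e in basis:
            # Subtract the component of v along the unit vector e.
            coeff = inner(c, v, e)
            u = [ui - coeff * ei for ui, ei in zip(u, e)]
        norm = math.sqrt(inner(c, u, u))
        basis.append([ui / norm for ui in u])
    return basis


C = [[2.0, 1.0], [1.0, 2.0]]                      # symmetric, positive-definite
basis = gram_schmidt(C, [[1.0, 0.0], [0.0, 1.0]])

# Gram matrix of the result; it should be (numerically) the identity.
gram = [[inner(C, e, f) for f in basis] for e in basis]
```

The Gram matrix being the identity is exactly the statement that the component matrix of the inner product with respect to the new basis is $I_n$.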


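Definition 4.2.12. Let V be a vector space over $\mathbb{R}$. A symplectic form on V is a
nondegenerate, antisymmetric bilinear form.

Definition 4.2.13. Let V be a finite-dimensional vector space over $\mathbb{R}$. A Minkowski
metric, sometimes called a Lorentz metric, on V is a symmetric bilinear form for
which there exists a basis that has a component matrix of either

$$\begin{pmatrix} -1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & -1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & -1 \end{pmatrix}.$$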


4.2.3 Signature of a Symmetric Bilinear Form 


Theorem 4.2.14 (Sylvester's Law of Inertia). Let $(\cdot,\cdot)$ be a symmetric bilinear
form on V with $\dim V = n$. Setting $r = n - (p + q)$, the triple of nonnegative
integers $(p, q, r)$ arising in (4.20) is independent of the basis.

Proof. Let $\mathcal{B} = (e_1, e_2, \ldots, e_n)$ be an ordered basis of V with respect to which the
component matrix of $(\cdot,\cdot)$ is given in (4.20). The rank of $(\cdot,\cdot)$, which is independent
of any basis, is $p + q$.

Let $V_1$ be a subspace of V of maximal dimension such that $(\cdot,\cdot)$ restricted to
$V_1$ is positive-definite. Then $\dim V_1 = p' \ge p$ because $(\cdot,\cdot)$ is positive-definite over
$\mathrm{Span}(e_1, \ldots, e_p)$. By Proposition 4.2.10, there is a basis $\mathcal{B}_1$ of $V_1$ with respect to
which the matrix of the form is $I_{p'}$.

Let $V_2$ be a subspace of V of maximal dimension such that $(\cdot,\cdot)$ restricted to $V_2$
is negative-definite, i.e., for all $v \in V_2 - \{0\}$, $(v, v) < 0$. Over $\mathrm{Span}(e_{p+1}, \ldots, e_{p+q})$,
the form is negative-definite, so $\dim V_2 = q' \ge q$. By Proposition 4.2.10, there is a
basis $\mathcal{B}_2$ of $V_2$ with respect to which the matrix of the form is $-I_{q'}$.

Clearly, $V_1 \cap V_2 = \{0\}$ so the subspace $V_1 + V_2$ has dimension $p' + q'$. Assume
that $p' > p$ or $q' > q$. Then with respect to $\mathcal{B}_1 \cup \mathcal{B}_2$, the restriction of $(\cdot,\cdot)$ to $V_1 + V_2$
is

$$\begin{pmatrix} I_{p'} & 0 \\ 0 & -I_{q'} \end{pmatrix},$$

which implies that the rank of $(\cdot,\cdot)$ on V is greater than $p + q$. This is a contradiction.
We deduce that $\dim V_1 = p$ and $\dim V_2 = q$. Since $V_1$ and $V_2$ were defined without
reference to a basis, the theorem follows.

The traditional statement of Sylvester's Law of Inertia is slightly different: If
A is a symmetric matrix and S is any invertible matrix such that $D = S A S^{\top}$ is
diagonal, then the number of negative elements in D is the same regardless of S.


Definition 4.2.15. Let $(\cdot,\cdot)$ be a symmetric bilinear form on a finite-dimensional
real vector space. The triple of nonnegative integers $(p, q, r)$ is called the signature
of $(\cdot,\cdot)$.

We point out the following properties and their relation to the signature $(p, q, r)$.
A symmetric bilinear form is:

• nondegenerate if and only if $r = 0$;

• an inner product if and only if $(p, q, r) = (n, 0, 0)$;

• a Minkowski metric if $(p, q, r)$ is either $(n - 1, 1, 0)$ or $(1, n - 1, 0)$.
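
The signature can be computed by hand with the procedure in the proof of Proposition 4.2.10, and that procedure can also be sketched in code. The Python below is a hedged illustration, not the book's algorithm verbatim: it diagonalizes a symmetric matrix by simultaneous row and column operations (a change of basis $C \mapsto S C S^{\top}$) over the rationals and counts the signs on the diagonal, skipping the rescaling to $\pm 1$ since only signs matter. The function name `signature` is hypothetical and the input is assumed to be a symmetric matrix with rational entries.

```python
from fractions import Fraction


def signature(c):
    """Return (p, q, r) for a symmetric matrix c, using congruence operations."""
    n = len(c)
    a = [[Fraction(x) for x in row] for row in c]
    p = q = 0
    for k in range(n):
        # Look for a vector e with (e, e) != 0 among the remaining basis vectors.
        pivot = next((i for i in range(k, n) if a[i][i] != 0), None)
        if pivot is None:
            # All remaining diagonal entries vanish: use (e_i + e_j, e_i + e_j) = 2 (e_i, e_j).
            pair = next(((i, j) for i in range(k, n) for j in range(i + 1, n)
                         if a[i][j] != 0), None)
            if pair is None:
                break  # the form is trivial on the remaining subspace
            i, j = pair
            for t in range(n):
                a[i][t] += a[j][t]
            for t in range(n):
                a[t][i] += a[t][j]
            pivot = i
        # Move the chosen basis vector into position k (swap rows and columns).
        a[k], a[pivot] = a[pivot], a[k]
        for row in a:
            row[k], row[pivot] = row[pivot], row[k]
        d = a[k][k]
        if d > 0:
            p += 1
        else:
            q += 1
        # Make the remaining basis vectors orthogonal to the chosen one.
        for i in range(k + 1, n):
            factor = a[i][k] / d
            if factor != 0:
                for t in range(n):
                    a[i][t] -= factor * a[k][t]
                for t in range(n):
                    a[t][i] -= factor * a[t][k]
    return p, q, n - p - q


minkowski = signature([[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
hyperbolic = signature([[0, 1], [1, 0]])
degenerate = signature([[1, 0, 0], [0, 0, 0], [0, 0, -1]])
```

By Theorem 4.2.14 the result does not depend on the order in which pivots are chosen, which is what makes a greedy choice of pivot legitimate here.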


PROBLEMS 

4.2.1. Prove Proposition 4.2.6. 

4.2.2. Prove Equation (4.19). 

4.2.3. Prove that every inner product on a real vector space is nondegenerate. 

4.2.4. Let $(\cdot,\cdot)$ be a bilinear form on V. Fix $v \in V$ and define $i_v \in V^*$ as the element
such that $i_v(w) = (v, w)$. Let $\mathcal{B} = \{e_1, e_2, \ldots, e_n\}$ be a basis of V, and let
$\mathcal{B}^* = \{e^{*1}, e^{*2}, \ldots, e^{*n}\}$ be the cobasis of $V^*$.

(a) Prove that $\psi : V \to V^*$ defined by $\psi(v) = i_v$ is a linear transformation.

(b) Prove that in coordinates $[i_v]_{\mathcal{B}^*} = C^{\top} [v]_{\mathcal{B}}$, where $c_{jk} = (e_j, e_k)$. [Comment
on notation: here we think of $[i_v]_{\mathcal{B}^*}$ as a column vector, whereas the coordinates of a
covector are usually written as a row vector, namely the transpose of this column.]

(c) Prove that $\psi$ is invertible if and only if $(\cdot,\cdot)$ is nondegenerate.

[If $(\cdot,\cdot)$ is an inner product, we denote $i_v$ by $v^{\flat}$ since it lowers the indices of the
components of v, i.e., turns a vector into a covector. The components of $v^{\flat}$ with
respect to $\mathcal{B}^*$ are $c_{jk} v^j$.]


4.2.5. Let $(\cdot,\cdot)$ be a nondegenerate bilinear form on V and refer to the previous exercise
for notations. Let C be the component matrix of $(\cdot,\cdot)$ with respect to some basis
$\mathcal{B}$. Define the function $(\cdot,\cdot)^*$ on $V^*$ by $(i_u, i_v)^* = (v, u)$, or in other words
$(\lambda, \mu)^* = (\psi^{-1}(\mu), \psi^{-1}(\lambda))$.

(a) Prove that $(\cdot,\cdot)^*$ is a bilinear form on $V^*$. Prove also
that if $(\cdot,\cdot)$ is an inner product, then so is $(\cdot,\cdot)^*$.

(b) Prove that the component matrix of $(\cdot,\cdot)^*$ with respect to $\mathcal{B}^*$ is $C^{-1}$.

(c) Define $\Psi : V^* \to V^{**}$, $\lambda \mapsto \Psi_\lambda$, by $\Psi_\lambda(\mu) = (\lambda, \mu)^*$. Under the canonical
isomorphism between V and $V^{**}$, show that $\Psi_{i_v} = v$ for all $v \in V$.

[In parallel with the previous exercise, if $(\cdot,\cdot)$ is an inner product, we denote $\Psi_\lambda$
by $\lambda^{\sharp}$ since it raises the indices of the components of $\lambda$, i.e., turns a covector into a
vector. Writing $(c^{jk})$ as the components of $C^{-1}$, the components of $\lambda^{\sharp}$ with respect
to $\mathcal{B}$ are $c^{jk} \lambda_k$.]

4.2.6. Let $(\cdot,\cdot)$ be a bilinear form on V. Let W be a subspace of V. Consider the
orthogonal subspace $W^{\perp}$ from Definition 4.2.9.

(a) Prove that $W^{\perp}$ is indeed a subspace of V.

(b) Prove that $W \subseteq W^{\perp}$ if and only if the form $(\cdot,\cdot)$ restricted to W is identically
0. [When this holds, W is called an isotropic subspace of V.]

(c) Prove that if $(\cdot,\cdot)$ is symmetric, then $W^{\perp} \cap W = \{0\}$.

(d) Prove that if $(\cdot,\cdot)$ is antisymmetric, it is not necessarily true that $W^{\perp} \cap W = \{0\}$.

4.2.7. Let V with $\dim V = 2k$ be equipped with a symplectic form. A Lagrangian subspace
of V is a subspace L such that $L^{\perp} = L$. Prove that $\dim L = k$.

4.2.8. Let $(\cdot,\cdot)$ be a bilinear form on V and let W, $W_1$, and $W_2$ be subspaces of V. Prove
the following.

(a) $W_1 \subseteq W_2$ implies $W_2^{\perp} \subseteq W_1^{\perp}$.

(b) $(W_1 + W_2)^{\perp} = W_1^{\perp} \cap W_2^{\perp}$.

(c) $(W_1 \cap W_2)^{\perp} = W_1^{\perp} + W_2^{\perp}$.

(d) If $(\cdot,\cdot)$ is nondegenerate, then $(W^{\perp})^{\perp} = W$.

4.2.9. Let V be a vector space over $\mathbb{C}$. An inner product $(\cdot,\cdot)$ over V is a function
$V \times V \to \mathbb{C}$ that is (1) conjugate symmetric: $(x, y) = \overline{(y, x)}$ for all $x, y \in V$; (2)
linear in the first entry; (3) positive-definite. Prove that there is a basis $\mathcal{B}$ of V
with respect to which, for all $x, y \in V$,

$$(x, y) = x^1 \overline{y^1} + x^2 \overline{y^2} + \cdots + x^n \overline{y^n},$$

where $[x]_{\mathcal{B}} = (x^i)$ and $[y]_{\mathcal{B}} = (y^i)$.


4.3 Adjoint, Self-Adjoint, and Automorphisms


In applications of bilinear forms to geometry, linear transformations that preserve 
the form play a key role. 

Suppose that V is a finite-dimensional vector space and $\mathcal{B}$ is a basis. Let $(\cdot,\cdot)$
be a nondegenerate bilinear form with component matrix C with respect to $\mathcal{B}$. If
$L : V \to V$ is a linear transformation with $[L]_{\mathcal{B}} = A$, then

$$(L(v), w) = (A[v])^{\top} C [w] = [v]^{\top} A^{\top} C [w].$$

There exists a unique linear transformation $L^{\dagger} : V \to V$ such that

$$(L(v), w) = (v, L^{\dagger}(w)) \quad \text{for all } v, w \in V.$$




We find the associated matrix $A^{\dagger}$ of $L^{\dagger}$ by remarking that if

$$[v]^{\top} A^{\top} C [w] = [v]^{\top} C (A^{\dagger} [w])$$

for all $v, w \in V$, then $A^{\top} C = C A^{\dagger}$ as matrices. Hence,

$$A^{\dagger} = C^{-1} A^{\top} C. \tag{4.23}$$

Definition 4.3.1. The linear transformation $L^{\dagger}$ such that $(L(v), w) = (v, L^{\dagger}(w))$
for all $v, w \in V$ is called the adjoint operator to L with respect to $(\cdot,\cdot)$.
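
Formula (4.23) can be checked numerically. The Python sketch below is an illustration only: the form is taken to be the one with matrix $\mathrm{diag}(-1, 1)$ on $\mathbb{R}^2$, the matrix `A` and the test vectors are arbitrary choices, and the helper names are hypothetical. It computes $C^{-1} A^{\top} C$ and compares $(L(v), w)$ with $(v, L^{\dagger}(w))$.

```python
from fractions import Fraction


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Return the matrix product a b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    """Return the inverse of an invertible 2 x 2 matrix, using exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def apply(a, v):
    """Return the coordinate vector a v."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


def form(c, v, w):
    """Evaluate [v]^T C [w]."""
    return sum(c[i][j] * v[i] * w[j] for i in range(len(v)) for j in range(len(w)))


C = [[-1, 0], [0, 1]]     # a nondegenerate symmetric form (illustrative choice)
A = [[1, 2], [3, 4]]      # matrix of L (illustrative choice)

# Equation (4.23): the matrix of the adjoint.
A_dag = matmul(matmul(inverse_2x2(C), transpose(A)), C)

v, w = [1, 2], [3, 5]
lhs = form(C, apply(A, v), w)        # (L(v), w)
rhs = form(C, v, apply(A_dag, w))    # (v, L^dagger(w))
```

For this choice of C the adjoint differs from the transpose only by the sign of the off-diagonal entries.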


More generally, let V and W be vector spaces equipped with nondegenerate
bilinear forms $(\cdot,\cdot)_V$ and $(\cdot,\cdot)_W$. Let $L : V \to W$ be a linear transformation. Then
there exists a unique linear map $L^{\dagger} : W \to V$ such that

$$(L(v), w)_W = (v, L^{\dagger}(w))_V.$$

We also call $L^{\dagger}$ the adjoint of L with respect to these forms. If $C_1$ is the matrix
corresponding to $(\cdot,\cdot)_V$ and $C_2$ is the matrix corresponding to $(\cdot,\cdot)_W$ with respect
to specific bases on V and W, and if A is the matrix representing L, then the adjoint
matrix $A^{\dagger}$ of $L^{\dagger}$ is

$$A^{\dagger} = C_1^{-1} A^{\top} C_2.$$


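Example 4.3.2. Let $L : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation between Euclidean
spaces, with matrix A with respect to the standard bases. For all $\vec{v} \in \mathbb{R}^n$ and $\vec{w} \in \mathbb{R}^m$,

$$L(\vec{v}) \cdot \vec{w} = (A\vec{v}) \cdot \vec{w} = (A\vec{v})^{\top} \vec{w} = \vec{v}^{\top} A^{\top} \vec{w} = \vec{v} \cdot (A^{\top} \vec{w}).$$

Therefore, the transpose $A^{\top}$ is the matrix corresponding to the adjoint of L when
we assume $\mathbb{R}^n$ and $\mathbb{R}^m$ are equipped with the usual dot product.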


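Proposition 4.3.3. Let V, W, and U be vector spaces equipped with nondegenerate
bilinear forms. Then the following formulas hold for the adjoint:

1. $(L_1 + L_2)^{\dagger} = L_1^{\dagger} + L_2^{\dagger}$ for all $L_1, L_2 \in \mathrm{Hom}(V, W)$.
2. $(cL)^{\dagger} = c L^{\dagger}$ for all $L \in \mathrm{Hom}(V, W)$ and all $c \in K$.
3. $(L_2 \circ L_1)^{\dagger} = L_1^{\dagger} \circ L_2^{\dagger}$ for all $L_1 \in \mathrm{Hom}(V, W)$ and all $L_2 \in \mathrm{Hom}(W, U)$.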


Proof. (Left as an exercise for the reader.) 


We are often led to consider two particular types of linear transformations
associated to the adjoint: automorphisms with respect to the form and self-adjoint
transformations. We describe these in the following paragraphs.


Definition 4.3.4. Let V be a vector space with a nondegenerate bilinear form
$f = (\cdot,\cdot)$. A linear transformation $L : V \to V$ is called self-adjoint with respect to
this form if $L = L^{\dagger}$. We use the same term for the matrix representing L.

Example 4.3.5. Consider $V = \mathbb{R}^n$ equipped with the dot product. A matrix A is
self-adjoint with respect to the dot product if $A = A^{\top}$, hence it is symmetric.


Because of this example, some authors refer to $L^{\dagger}$ as defined above as the
transpose of L with respect to a form (or forms) and use the word adjoint of a linear
transformation only in the cases when V and W are vector spaces over $\mathbb{C}$ and when
the form $(\cdot,\cdot)$ is sesquilinear. (A sesquilinear form on a complex vector space is one
that satisfies conditions (1) and (2) in Exercise 4.2.9. See [31] for a discussion on
sesquilinear forms.)


Definition 4.3.6. Let V be a vector space with a nondegenerate bilinear form
$f = (\cdot,\cdot)$. An automorphism of (V, f) is an invertible linear transformation $L : V \to V$
such that

$$(L(v_1), L(v_2)) = (v_1, v_2) \quad \text{for all } v_1, v_2 \in V. \tag{4.24}$$


The property (4.24) shows that automorphisms preserve the bilinear form. This
condition is equivalent to $(v_1, L^{\dagger}(L(v_2))) = (v_1, v_2)$ for all $v_1, v_2 \in V$. Since f is
nondegenerate, $L^{\dagger} \circ L = \mathrm{id}_V$, where $\mathrm{id}_V$ is the identity on V. This gives the
following proposition.

Proposition 4.3.7. A linear transformation $L : V \to V$ is an automorphism of
(V, f) if and only if L is invertible with $L^{\dagger} \circ L = \mathrm{id}_V$.


If V is a finite-dimensional vector space, we can simplify Definition 4.3.6.
Suppose that V is finite-dimensional and that L is a linear transformation satisfying
(4.24). We can still conclude that $L^{\dagger} \circ L = \mathrm{id}_V$. By properties of functions,
we deduce that L is injective. By the Rank-Nullity Theorem, an injective linear
transformation between vector spaces of the same finite dimension is invertible.
Hence, when V is finite-dimensional, condition (4.24) implies that L is invertible.


Proposition 4.3.8. Let (V, f) be a vector space equipped with a nondegenerate
bilinear form. Then the set S of automorphisms of (V, f) satisfies the following:

1. S is closed under composition: $L_1 \circ L_2 \in S$ for all $L_1, L_2 \in S$.
2. The identity $\mathrm{id}_V$ is in S.
3. If $L \in S$, then L is invertible and $L^{-1} \in S$.

Proof. We have already discussed the first property, and the second is obvious. For
the third property, note that for all $L \in S$, we have $L^{\dagger} \circ L = \mathrm{id}_V$ and L is invertible.
Thus $L^{-1} = L^{\dagger}$, so $L \circ L^{\dagger} = \mathrm{id}_V$ as well. Furthermore, for all $v, w \in V$,

$$(L^{\dagger}(v), L^{\dagger}(w)) = (L \circ L^{\dagger}(v), L \circ L^{\dagger}(w)) = (v, w),$$

where the first equality holds because L is an automorphism. Thus, $L^{\dagger} = L^{-1}$ is an automorphism.

(Using the language of modern algebra, Proposition 4.3.8, along with the
associativity of linear transformations, shows that the set of automorphisms of (V, f) is
a group. This group is denoted by Aut(V, f).)

If a vector space V has an ordered basis $\mathcal{B} = \{e_1, e_2, \ldots, e_n\}$, then (4.23)
leads to a characterization of matrices of automorphisms. Let C be the matrix
associated to the bilinear form f, and let A be the matrix of a linear transformation
$L : V \to V$ in reference to $\mathcal{B}$. Then by Proposition 4.3.7, L is an automorphism if
and only if

$$A^{-1} = C^{-1} A^{\top} C. \tag{4.25}$$


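Example 4.3.9. Example 4.2.5 indicates that the dot product is a symmetric,
nondegenerate bilinear form with associated matrix $I_n$, and Example 4.3.2
shows that the transpose of a matrix is the adjoint of a matrix with respect to the
dot product. However, consider the symmetric bilinear forms $f_1$ and $f_2$ on $\mathbb{R}^4$ given
by the matrices

$$C_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad\text{and}\quad C_2 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}.$$

Then a simple calculation using (4.23) shows that the adjoint of $A = (a_{ij})$ with
respect to $f_1$ is

$$A^{\dagger} = \begin{pmatrix} a_{22} & a_{12} & a_{32} & a_{42} \\ a_{21} & a_{11} & a_{31} & a_{41} \\ a_{23} & a_{13} & a_{33} & a_{43} \\ a_{24} & a_{14} & a_{34} & a_{44} \end{pmatrix},$$

and the adjoint of A with respect to $f_2$ is

$$A^{\dagger} = \begin{pmatrix} a_{44} & a_{34} & a_{24} & a_{14} \\ a_{43} & a_{33} & a_{23} & a_{13} \\ a_{42} & a_{32} & a_{22} & a_{12} \\ a_{41} & a_{31} & a_{21} & a_{11} \end{pmatrix}.$$

Now consider just $f_1$. Comparing A with its adjoint entry by entry, the matrix A
corresponds to a self-adjoint linear transformation if and only if it has the form

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{11} & a_{23} & a_{24} \\ a_{23} & a_{13} & a_{33} & a_{34} \\ a_{24} & a_{14} & a_{34} & a_{44} \end{pmatrix}.$$

A matrix A corresponds to an automorphism if and only if $A^{\dagger} A = I_4$, which is a
system of 16 quadratic equations in the 16 variables of the entries of A. Many of
these equations will be redundant.

Proposition 4.3.10. If $(V, (\cdot,\cdot))$ is a vector space with a nondegenerate symmetric
form, then $L^{\dagger\dagger} = L$ for all $L \in \mathrm{Hom}(V, V)$.

Proof. By definition, $(L(v), w) = (v, L^{\dagger}(w))$ for all $v, w \in V$. Since the form is
symmetric, $(L(v), w) = (w, L(v))$. Thus,

$$(w, L(v)) = (v, L^{\dagger}(w)) = (L^{\dagger}(w), v) = (w, L^{\dagger\dagger}(v)).$$

Since these equalities hold for all $v, w \in V$ and since $(\cdot,\cdot)$ is nondegenerate, we
conclude that $L = L^{\dagger\dagger}$.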


Example 4.3.11. Consider $V = \mathbb{R}^n$ and consider the dot product as a symmetric
bilinear form. We know from elementary linear algebra that an $n \times n$ matrix A represents an
automorphism (of the dot product) if and only if $A^{\top} = A^{-1}$, i.e., if A is orthogonal. The set
of $n \times n$ orthogonal matrices is denoted by O(n). This is the set of isometries of
$\mathbb{R}^n$ that fix the origin. The special orthogonal group SO(n) defined in Section 2.3,
which consists of all orthogonal matrices with determinant 1, is the set of rotations in $\mathbb{R}^n$.
In particular, if n = 2, all orthogonal matrices have the form

$$\begin{pmatrix} \cos\theta & -\varepsilon \sin\theta \\ \sin\theta & \varepsilon \cos\theta \end{pmatrix},$$

for some angle $\theta$ and some $\varepsilon = \pm 1$. Such a matrix is in SO(2) when $\varepsilon = 1$.


Example 4.3.12 (Lorentzian Transformations). Minkowski spacetime is $\mathbb{R}^4$
equipped with the Minkowski metric. It is common to denote Minkowski space
by $\mathbb{R}^{3,1}$. Points are located by the quadruple $(x^0, x^1, x^2, x^3)$, with $(x^1, x^2, x^3)$
serving as space variables and $x^0$ representing time.

With respect to the standard basis of $\mathbb{R}^4$, the Minkowski metric is

$$g\big((x^0, x^1, x^2, x^3), (y^0, y^1, y^2, y^3)\big) = -x^0 y^0 + x^1 y^1 + x^2 y^2 + x^3 y^3,$$

so the representing matrix is

$$C = \begin{pmatrix} -1 & 0 \\ 0 & I_3 \end{pmatrix}.$$

Note that $C^{-1} = C$.

We propose to study the automorphisms of the Minkowski metric. Using block
diagonal properties, it is not hard to see that any linear transformation with matrix

$$\begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix},$$

where $R \in O(3)$, is an automorphism of this form. This corresponds to an orthogonal
transformation in the space variables. Also using block diagonal properties, we
can see that

$$\begin{pmatrix} \pm 1 & 0 \\ 0 & I_3 \end{pmatrix}$$

are the only two matrices corresponding to automorphisms that fix all the space
variables.

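To understand automorphisms that intermingle the space and time variables,
we consider the situation on $\mathbb{R}^2$ where the Minkowski metric has the matrix

$$C = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}.$$

For a generic matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, the adjoint with respect to this bilinear form is

$$A^{\dagger} = C^{-1} A^{\top} C = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & -c \\ -b & d \end{pmatrix}.$$

The matrix A represents an automorphism when $A^{\dagger} A = I$, which is equivalent to

$$a^2 - c^2 = 1, \qquad ab - cd = 0, \qquad -b^2 + d^2 = 1.$$

As the first equation parametrizes a hyperbola, there exist $\varepsilon_1 = \pm 1$ and $u \in \mathbb{R}$ such
that $a = \varepsilon_1 \cosh u$ and $c = \sinh u$. By the second equation, we deduce that
$b = cd/a = d \varepsilon_1 \tanh u$. Then from $-b^2 + d^2 = 1$, we deduce that

$$-d^2 \tanh^2 u + d^2 = d^2 \operatorname{sech}^2 u = 1,$$

so $d = \varepsilon_2 \cosh u$ for some $\varepsilon_2 = \pm 1$, from which we also deduce $b = \varepsilon_1 \varepsilon_2 \sinh u$.
Hence, the matrix A has the form

$$A = \begin{pmatrix} \varepsilon_1 \cosh u & \varepsilon_1 \varepsilon_2 \sinh u \\ \sinh u & \varepsilon_2 \cosh u \end{pmatrix}.$$

This gives all the matrices representing automorphisms of the Minkowski
metric on $\mathbb{R}^{1,1}$. The set of automorphisms on $\mathbb{R}^{3,1}$ is the smallest subset of $GL_4(\mathbb{R})$
closed under multiplication and taking inverses that includes every matrix of the
form

$$\begin{pmatrix} \varepsilon_1 \cosh u & \varepsilon_1 \varepsilon_2 \sinh u & 0 \\ \sinh u & \varepsilon_2 \cosh u & 0 \\ 0 & 0 & I_2 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix},$$

where $R \in O(3)$. This describes all automorphisms on $\mathbb{R}^{3,1}$.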
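
The three quadratic conditions above can be verified numerically for the hyperbolic parametrization. The Python sketch below is an illustration under stated assumptions: it works on $\mathbb{R}^{1,1}$ with the form $\mathrm{diag}(-1, 1)$, uses floating-point arithmetic with a tolerance, and the function names and sample values of u are choices made here. It uses the explicit adjoint formula derived in the example and checks $A^{\dagger} A = I$ for all four sign choices.

```python
import math


def hyperbolic_matrix(u, eps1=1, eps2=1):
    """Matrix with a = eps1 cosh u, b = eps1 eps2 sinh u, c = sinh u, d = eps2 cosh u."""
    return [[eps1 * math.cosh(u), eps1 * eps2 * math.sinh(u)],
            [math.sinh(u), eps2 * math.cosh(u)]]


def adjoint(a):
    """Adjoint with respect to diag(-1, 1): (a, b; c, d) maps to (a, -c; -b, d)."""
    (p, q), (r, s) = a
    return [[p, -r], [-q, s]]


def matmul(a, b):
    """Return the product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def is_identity(m, tol=1e-9):
    """Check that m equals the 2 x 2 identity up to the tolerance tol."""
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))


results = []
for u in (-1.5, 0.0, 0.7):
    for eps1 in (1, -1):
        for eps2 in (1, -1):
            a = hyperbolic_matrix(u, eps1, eps2)
            results.append(is_identity(matmul(adjoint(a), a)))
```

Every entry of `results` should be `True`, reflecting the identity $\cosh^2 u - \sinh^2 u = 1$.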

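To apply this to special relativity, set $t = x^0$ and $x = x^1$. We imagine that
one observer $O$ has frame axes $t$ and $x$, and a second observer $\bar{O}$ with axes $\bar{t}$ and
$\bar{x}$ travels with respect to $O$ along the $x$-axis with velocity $v$. In the $\bar{t}\bar{x}$-frame, the
$\bar{t}$-axis, namely $\bar{x} = 0$, is the trajectory of $\bar{O}$, namely the line with equation $vt = x$
in the $O$ frame. Thus, there are nonzero constants $\lambda$ and $\mu$ such that

\[
\mu \begin{pmatrix} 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} \varepsilon_1 \cosh u & \varepsilon_1 \varepsilon_2 \sinh u \\ \sinh u & \varepsilon_2 \cosh u \end{pmatrix} \lambda \begin{pmatrix} 1 \\ v \end{pmatrix}
= \lambda \begin{pmatrix} \varepsilon_1 \cosh u + \varepsilon_1 \varepsilon_2 v \sinh u \\ \sinh u + \varepsilon_2 v \cosh u \end{pmatrix}.
\]

From the second component, we deduce that $v = -\varepsilon_2 \tanh u$. Since $\operatorname{sech}^2 u = 1 - \tanh^2 u$,
we get $\cosh u = 1/\sqrt{1-v^2}$ and $\sinh u = -\varepsilon_2 v/\sqrt{1-v^2}$. Using the variable $v$, we
have

\[
A = \begin{pmatrix} \dfrac{\varepsilon_1}{\sqrt{1-v^2}} & -\dfrac{\varepsilon_1 v}{\sqrt{1-v^2}} \\[2ex] -\dfrac{\varepsilon_2 v}{\sqrt{1-v^2}} & \dfrac{\varepsilon_2}{\sqrt{1-v^2}} \end{pmatrix}. \tag{4.26}
\]

Since the range of $\tanh u$ is $(-1,1)$, we still have all the automorphisms of the
Minkowski metric, assuming that $\varepsilon_1 = \pm 1$, $\varepsilon_2 = \pm 1$, and $v \in (-1,1)$. In $GL_2(\mathbb{R})$,
the subset of matrices of this form consists of 4 connected components, each being
a curve parametrized by $v \in (-1,1)$ and designated by the four possible values of
the pair $(\varepsilon_1, \varepsilon_2)$.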
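
The change of variable from $u$ to $v$ can be checked numerically. The sketch below (an added illustration with hypothetical function names) builds the matrix once from the rapidity $u$ and once from the velocity $v$ using $v = -\varepsilon_2 \tanh u$, $\cosh u = 1/\sqrt{1-v^2}$, and compares the two. It also confirms that the vector $(1, v)^T$ is sent to a multiple of $(1, 0)^T$.

```python
import math

def from_rapidity(u, e1=1, e2=1):
    """Matrix written with cosh u and sinh u."""
    return [[e1 * math.cosh(u), e1 * e2 * math.sinh(u)],
            [math.sinh(u), e2 * math.cosh(u)]]

def from_velocity(v, e1=1, e2=1):
    """Same matrix written with v in (-1, 1), using v = -e2 * tanh(u)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[e1 * g, -e1 * v * g],
            [-e2 * v * g, e2 * g]]

def close(X, Y, tol=1e-9):
    return all(abs(a - b) < tol for r, s in zip(X, Y) for a, b in zip(r, s))

def agree(v, e1, e2):
    u = math.atanh(-e2 * v)          # solve v = -e2 * tanh(u) for u
    return close(from_rapidity(u, e1, e2), from_velocity(v, e1, e2))

all_agree = all(agree(v, e1, e2)
                for v in (-0.9, -0.3, 0.0, 0.6)
                for e1 in (1, -1) for e2 in (1, -1))

# The trajectory direction (1, v) maps to a multiple of (1, 0).
v = 0.6
A = from_velocity(v)
image = [A[0][0] + A[0][1] * v, A[1][0] + A[1][1] * v]
print(all_agree, image)
```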

Returning to the full context of Minkowski space $\mathbb{R}^{3+1}$, the automorphisms
include

\[
A(v) = \begin{pmatrix}
\dfrac{\varepsilon_1}{\sqrt{1-v^2}} & -\dfrac{\varepsilon_1 v}{\sqrt{1-v^2}} & 0 & 0 \\[2ex]
-\dfrac{\varepsilon_2 v}{\sqrt{1-v^2}} & \dfrac{\varepsilon_2}{\sqrt{1-v^2}} & 0 & 0 \\[1ex]
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}. \tag{4.27}
\]

This is an example of a Lorentz transformation associated to the vector $(0,v,0,0)^T$.
It is also clear that the Minkowski metric is invariant under any orthogonal
transformation in the $(x^1, x^2, x^3)$ variables. The group of automorphisms of the Minkowski
metric is called the Lorentz group and, as a subset of $GL_4(\mathbb{R})$, consists of any
finite product of matrices of the form (4.27) and orthogonal matrices in the space
variables. We denote the Lorentz group by $O(3, 1)$.

Generalizing (4.27), if $\vec{v}$ is some vector in the space variables with $\|\vec{v}\| < 1$, then
the Lorentz transformation associated to $\vec{v}$ is the linear map $\mathbb{R}^4 \to \mathbb{R}^4$ that has the
matrix $A(\vec{v})$ obtained by the composition $M A(\|\vec{v}\|) M^{-1}$, where $M$ is some rotation
matrix that sends the unit $x$-direction vector to $\vec{v}/\|\vec{v}\|$. Exercise 4.3.9 gives the
exact value of this matrix.

As in the case of Minkowski space $\mathbb{R}^{1,1}$, the freedom of choosing values of $\varepsilon_1$
and $\varepsilon_2$ implies that $O(3,1)$ has 4 connected components. Only the component
corresponding to $\varepsilon_1 = \varepsilon_2 = 1$ contains the identity matrix $I_4$. This subset is called
the restricted Lorentz group and is denoted $SO^+(3,1)$. Matrices in the restricted
Lorentz group are called Lorentz transformations. As we will see in Section 7.2,
Lorentz transformations play a central role in special relativity.


PROBLEMS 


4.3.1. Let $\langle\cdot,\cdot\rangle$ be a nondegenerate bilinear form on a finite-dimensional vector space $V$. Prove that
the set of self-adjoint linear transformations is a vector subspace of $\operatorname{Hom}(V, V)$.


4.3.2. Let $\langle\cdot,\cdot\rangle$ be a nondegenerate form on $V$ and let $L \in \operatorname{Hom}(V,V)$. Let $C$ be the
component matrix of $\langle\cdot,\cdot\rangle$ with respect to a basis $\mathcal{B}$ and let $A$ be the matrix
representing $L$. Determine the matrix representing $L^{\dagger\dagger}$ and conclude that if $\langle\cdot,\cdot\rangle$
is not symmetric, then it is not necessarily true that $L^{\dagger\dagger} = L$.


4.3.3. Let $(V, f)$ be a vector space equipped with a nondegenerate bilinear form. Prove
that the set of automorphisms is not closed under addition.


4.3.4. Let $(V, f)$ be a vector space equipped with a nondegenerate bilinear form. Prove
that the set of self-adjoint transformations is closed under composition.


4.3.5. Let $L$ be an automorphism of an inner product space. Prove that the eigenvalues
of $L$ are $1$ or $-1$.


4.3.6. Let $V$ and $W$ be finite-dimensional vector spaces over a field $K$. Suppose that $V$ and $W$ are
equipped with nondegenerate bilinear forms denoted by $\langle\,,\rangle_V$ and $\langle\,,\rangle_W$, respectively.
Let $L : V \to W$ be a surjective linear transformation, and let $L^\dagger$ be its
adjoint, namely, $L^\dagger : W \to V$ satisfies

\[
\langle L(v), w\rangle_W = \langle v, L^\dagger(w)\rangle_V
\]

for all $v \in V$ and $w \in W$.

(a) Show that $L^\dagger$ is injective.

(b) Assume in addition that for all $v \in V$ with $v \neq 0$, $\langle v,v\rangle_V \neq 0$. Then show
that $\operatorname{Ker} L$ and $\operatorname{Im} L^\dagger$ are orthogonal complements in $V$, that is:

(i) $\operatorname{Ker} L \cap \operatorname{Im} L^\dagger = \{0\}$;
(ii) for all $v_1 \in \operatorname{Ker} L$ and $v_2 \in \operatorname{Im} L^\dagger$, we have $\langle v_1, v_2\rangle_V = 0$; and
(iii) all $v \in V$ can be written as $v = v_1 + v_2$, where $v_1 \in \operatorname{Ker} L$ and $v_2 \in \operatorname{Im} L^\dagger$.

[Hint: Let $\phi = L \circ L^\dagger : W \to W$. Show that $\phi$ is invertible. For all
$v \in V$, let $v_2 = (L^\dagger \circ \phi^{-1} \circ L)(v)$ and set $v_1 = v - v_2$; show that
$v_1 \in \operatorname{Ker} L$.]


4.3.7. The definition for an isometry on $\mathbb{R}^n$ is any function $f : \mathbb{R}^n \to \mathbb{R}^n$ satisfying
$d(f(\vec{x}), f(\vec{y})) = d(\vec{x},\vec{y})$ for all $\vec{x},\vec{y} \in \mathbb{R}^n$. Prove that any isometry on $\mathbb{R}^n$ that
fixes the origin is an orthogonal transformation. [Hint: If $f$ is an isometry, first
prove that $f$ preserves the dot product between any two vectors; prove that $f$ maps
parallelograms to parallelograms; deduce that $f$ is linear.]


4.3.8. Let $\vec{u} = (0,u,0,0)^T$ and $\vec{v} = (0,v,0,0)^T$ be two vectors in Minkowski spacetime
$\mathbb{R}^{1+3}$. The matrix $A(\vec{v})A(\vec{u})$ represents the Lorentz transformation from an observer
$O$ to $O''$ in which $O''$ moves relative to an observer $O'$ along the $x$-axis with velocity
$v$ and $O'$ moves relative to an observer $O$ along the $x$-axis with velocity $u$. Prove
that

\[
A(\vec{v})A(\vec{u}) = A\!\left(\frac{\vec{u}+\vec{v}}{1+uv}\right).
\]

[This is the velocity addition law in special relativity.]


4.3.9. Consider the velocity vector $\vec{v}$ in $\mathbb{R}^{1+3}$ with $\vec{v} = (0, v_1, v_2, v_3)^T$ and $v = \|\vec{v}\| < 1$.
Call $\gamma = 1/\sqrt{1 - v^2}$ and let $\vec{u} = \vec{v}/\|\vec{v}\| = (0, u_1, u_2, u_3)$. Show by direct verification
that

\[
A(\vec{v}) = \begin{pmatrix}
\gamma & -v_1\gamma & -v_2\gamma & -v_3\gamma \\
-v_1\gamma & 1 + u_1^2(\gamma-1) & u_1u_2(\gamma-1) & u_1u_3(\gamma-1) \\
-v_2\gamma & u_1u_2(\gamma-1) & 1 + u_2^2(\gamma-1) & u_2u_3(\gamma-1) \\
-v_3\gamma & u_1u_3(\gamma-1) & u_2u_3(\gamma-1) & 1 + u_3^2(\gamma-1)
\end{pmatrix}
\]

is an automorphism of the Minkowski metric, has $A(\vec{0}) = I_4$, and satisfies

\[
A(\vec{v}) \begin{pmatrix} 1 \\ v_1 \\ v_2 \\ v_3 \end{pmatrix} = \mu \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \quad \text{for some } \mu \in \mathbb{R}.
\]

This is the matrix described at the end of Example 4.3.12.


4.4 Tensor Product 


So far in this chapter, we have studied the dual of a vector space, the Hom-space, and 
bilinear forms on vector spaces. This section generalizes those constructions through 
what is called the tensor product. The order of presentation clearly betrays this 
book’s mathematical bias in contrast to a physicist’s approach: We first present the 
structure of a tensor product abstractly and only in the subsequent section discuss 
the components of a tensor and how they change under a change of basis. 

(The following construction is a little abstract. The casual reader may feel free 
to focus attention on the explanations and propositions following Definition 4.4.2.) 

Let $U$, $V$, and $W$ be vector spaces over a field $K$. Recall that a function $f :
V \times W \to U$ is called a bilinear transformation if $f$ is linear in both of its input
variables. More precisely, $f$ satisfies

\[
\begin{aligned}
f(v_1 + v_2, w) &= f(v_1, w) + f(v_2, w), & f(\lambda v, w) &= \lambda f(v, w), \\
f(v, w_1 + w_2) &= f(v, w_1) + f(v, w_2), & f(v, \lambda w) &= \lambda f(v, w),
\end{aligned}
\]

for all $v_1, v_2, v \in V$, for all $w_1, w_2, w \in W$, and all $\lambda \in K$.

It is crucial to point out that a bilinear transformation $V \times W \to U$ is not
equivalent to a linear transformation $V \oplus W \to U$. A linear transformation $T :
V \oplus W \to U$ satisfies

\[
T(v_1, w) + T(v_2, w) = T((v_1, w) + (v_2, w)) = T(v_1 + v_2, 2w)
\]

for all $v_1, v_2 \in V$ and $w \in W$. This differs from the first axiom of a bilinear
transformation. This observation motivates the following important proposition,
which leads to the concept of tensor product.
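
To make the contrast concrete, here is a small Python sketch (an added illustration; the two sample maps `dot` and `total` are chosen for this example only). The dot product on $\mathbb{R}^2 \times \mathbb{R}^2$ is bilinear, while the map summing all coordinates is linear on $\mathbb{R}^2 \oplus \mathbb{R}^2$; each satisfies its own additivity rule and fails the other's.

```python
def dot(v, w):
    """A bilinear map R^2 x R^2 -> R."""
    return sum(a * b for a, b in zip(v, w))

def total(v, w):
    """A linear map on the direct sum R^2 (+) R^2 -> R."""
    return sum(v) + sum(w)

def add(v, w):
    return [a + b for a, b in zip(v, w)]

def scale(c, v):
    return [c * a for a in v]

v1, v2, w = [1, 2], [3, -1], [2, 5]

# Bilinear rule: f(v1, w) + f(v2, w) = f(v1 + v2, w).
bilinear_rule = dot(v1, w) + dot(v2, w) == dot(add(v1, v2), w)
# The linear rule T(v1, w) + T(v2, w) = T(v1 + v2, 2w) fails for dot.
dot_is_not_linear = dot(v1, w) + dot(v2, w) != dot(add(v1, v2), scale(2, w))

# Linear rule holds for total, and the bilinear rule fails for it.
linear_rule = total(v1, w) + total(v2, w) == total(add(v1, v2), scale(2, w))
total_is_not_bilinear = total(v1, w) + total(v2, w) != total(add(v1, v2), w)

print(bilinear_rule, dot_is_not_linear, linear_rule, total_is_not_bilinear)
```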


Proposition 4.4.1. Let $U$, $V$, and $W$ be vector spaces over a field $K$. There exists
a unique vector space $Z$ over $K$ and a bilinear transformation $\psi : V \times W \to Z$ such
that for any bilinear transformation $f : V \times W \to U$, there exists a unique linear
transformation $\bar{f} : Z \to U$ such that $f = \bar{f} \circ \psi$.


Proof. We first prove the existence of the vector space $Z$. Consider the set $\tilde{Z}$ of
formal finite linear combinations

\[
c_1(v_1, w_1) + c_2(v_2, w_2) + \cdots + c_l(v_l, w_l),
\]

where $v_i \in V$, $w_i \in W$, and $c_i \in K$ for $1 \le i \le l$. It is not hard to see that $\tilde{Z}$ is
a vector space over $K$. Consider now the subspace $\tilde{Z}_{\mathrm{lin}}$ spanned by vectors of the
form

\[
\begin{aligned}
&(v_1 + v_2, w) - (v_1, w) - (v_2, w), & &(\lambda v, w) - \lambda(v, w), \\
&(v, w_1 + w_2) - (v, w_1) - (v, w_2), & &(v, \lambda w) - \lambda(v, w).
\end{aligned} \tag{4.28}
\]

Define $Z$ as the quotient vector space $Z = \tilde{Z}/\tilde{Z}_{\mathrm{lin}}$. The elements of $Z$ are equivalence
classes of elements of $\tilde{Z}$ under the equivalence relation $u \sim v$ if and only if $v - u \in
\tilde{Z}_{\mathrm{lin}}$. (See Exercise 4.1.10 for a description of the vector space quotient.)

Define $\psi : V \times W \to Z$ as the composition $\psi = \pi \circ i$, where $\pi : \tilde{Z} \to Z$ is the
canonical projection and $i : V \times W \to \tilde{Z}$ is the inclusion. The space $\tilde{Z}_{\mathrm{lin}}$ is defined
in such a way that composing with the canonical projection $\pi$ makes $\psi$ a bilinear transformation.

Now given any bilinear transformation $f : V \times W \to U$, we can complete $f$ by
linearity to define a linear transformation $\tilde{f}$ from $\tilde{Z}$ to $U$ via

\[
\tilde{f}(c_1(v_1, w_1) + \cdots + c_l(v_l, w_l)) = c_1 f(v_1, w_1) + \cdots + c_l f(v_l, w_l).
\]

If $z_0 \in \tilde{Z}_{\mathrm{lin}}$, then $z_0$ is a linear combination of elements of the form in Equation
(4.28). However, every element of the form given in Equation (4.28) maps to 0 under
$\tilde{f}$, so $\tilde{f}(z_0) = 0$. Therefore, if $z_1, z_2 \in \tilde{Z}$ are such that $z_1 \sim z_2$, then $z_1 - z_2 = z_0 \in \tilde{Z}_{\mathrm{lin}}$,
so $\tilde{f}(z_1 - z_2) = 0$ and $\tilde{f}(z_1) = \tilde{f}(z_2)$. Hence, $\tilde{f}$ induces a function $\bar{f} : \tilde{Z}/\tilde{Z}_{\mathrm{lin}} \to U$.
It is easy to check that $\bar{f}$ is a linear transformation and that $f = \bar{f} \circ \psi$. Since the
image of $\psi$ spans $\tilde{Z}/\tilde{Z}_{\mathrm{lin}}$, it follows that the induced map $\bar{f}$ is uniquely determined.
This proves the existence of $Z$.

To prove uniqueness of $Z$, suppose there is another vector space $Z'$ and a bilinear
transformation $\psi' : V \times W \to Z'$ with the desired property. Then there exist linear maps $\bar{\psi} : Z' \to Z$ and
$\bar{\psi}' : Z \to Z'$ such that $\psi = \bar{\psi} \circ \psi'$ and $\psi' = \bar{\psi}' \circ \psi$. Then we have $\psi = \bar{\psi} \circ \bar{\psi}' \circ \psi$. However,
$\psi = \mathrm{id}_Z \circ \psi$, and since we know that $\psi$ factors through $Z$ with a unique map, then
$\bar{\psi} \circ \bar{\psi}' = \mathrm{id}_Z$. Similarly, we can show that $\bar{\psi}' \circ \bar{\psi} = \mathrm{id}_{Z'}$. Thus, $Z \cong Z'$, and so $Z$ is
unique up to a canonical isomorphism.


As a first comment about the proof, we observe that $\tilde{Z}$ is a "large" vector space;
its basis consists of every pair $(v,w)$ with $v \in V$ and $w \in W$, so has cardinality
$|V \times W|$.

We can depict the functional relationships described in this proposition by the
following commutative diagram.

\[
\begin{array}{ccc}
V \times W & \xrightarrow{\ \psi\ } & Z \\
 & {\scriptstyle f}\searrow & \big\downarrow{\scriptstyle \bar{f}} \\
 & & U
\end{array}
\]

Observe that $f$ and $\psi$ are bilinear transformations while $\bar{f}$ is linear. Consequently,
though a bilinear transformation $V \times W \to U$ is not equivalent to a linear
transformation $V \oplus W \to U$, it is equivalent to a linear transformation $Z \to U$.


Definition 4.4.2. The vector space $Z$ in the above proposition is called the tensor
product of $V$ and $W$ and is denoted by $V \otimes W$. The element $\psi(v,w)$ in $V \otimes W$ is
denoted by $v \otimes w$.


Elements of $V \otimes W$ are linear combinations of vectors of the form $v \otimes w$, with
$v \in V$ and $w \in W$. With this notation, the following identities hold:

\[
\begin{aligned}
(v_1 + v_2) \otimes w &= v_1 \otimes w + v_2 \otimes w, & (\lambda v) \otimes w &= \lambda(v \otimes w), \\
v \otimes (w_1 + w_2) &= v \otimes w_1 + v \otimes w_2, & v \otimes (\lambda w) &= \lambda(v \otimes w).
\end{aligned} \tag{4.29}
\]


Definition 4.4.3. Any element of $V \otimes W$ is often simply called a tensor. A tensor
in $V \otimes W$ that can be written as $v \otimes w$ for $v \in V$ and $w \in W$ is called a pure tensor.


From the identity $c(v \otimes w) = (cv) \otimes w$, a linear combination of pure tensors is
just a sum of pure tensors. We remind the reader that even if $V$ and $W$ are not
finite dimensional, linear combinations always consist of a finite number of terms.
So every tensor in $V \otimes W$ is a finite sum of pure tensors.


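Definition 4.4.4. The rank of a tensor $t \in V \otimes W$ is the least integer $r$ such that

\[
t = v_1 \otimes w_1 + v_2 \otimes w_2 + \cdots + v_r \otimes w_r
\]

for some $v_i \in V$ and $w_i \in W$.


Example 4.4.5. Let $V = \mathbb{R}^3$. Let $t \in V \otimes V$ be $t = \vec{\imath} \otimes \vec{\jmath} + \vec{\imath} \otimes \vec{k}$. Though currently
written as a sum of two pure tensors, it is in fact not a tensor of rank 2 but is a
pure tensor because $t = \vec{\imath} \otimes (\vec{\jmath} + \vec{k})$.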
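
In coordinates this can be checked mechanically. The sketch below is an added illustration that assumes two things not stated in the example: a tensor in $\mathbb{R}^3 \otimes \mathbb{R}^3$ is represented by its $3 \times 3$ array of components in the standard basis (so $v \otimes w$ becomes the outer product), and the rank of such a tensor equals the rank of that matrix (compare Exercise 4.4.10). The function names are hypothetical.

```python
from fractions import Fraction

def outer(v, w):
    """Component array of the pure tensor v (x) w."""
    return [[a * b for b in w] for a in v]

def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def matrix_rank(M):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in M]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows))
                      if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [x - factor * y
                           for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

i, j, k = [1, 0, 0], [0, 1, 0], [0, 0, 1]
t = add(outer(i, j), outer(i, k))          # i (x) j + i (x) k
s = add(outer(i, i), outer(j, j))          # i (x) i + j (x) j
j_plus_k = [a + b for a, b in zip(j, k)]

print(matrix_rank(t), t == outer(i, j_plus_k), matrix_rank(s))
```

By contrast, the tensor `s` above cannot be rewritten as a single pure tensor, since its component matrix has rank 2.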


Proposition 4.4.6. A pure tensor $v \otimes w$ in $V \otimes W$ is the 0 tensor if and only if
$v = 0$ or $w = 0$.


Proof. By linearity $0 = 0(v' \otimes w) = (0v') \otimes w$, so $0 \otimes w = 0$, and the same argument
applies if $w = 0$.

Conversely, assume that $v \neq 0$ and $w \neq 0$. Let $\mathcal{B}_1$ be a basis of $V$ containing $v$
and let $\mathcal{B}_2$ be a basis of $W$ containing $w$. Consider the function $f : V \times W \to K$
defined by $f(x,y) = v^*(x)w^*(y)$, where $v^* \in V^*$ is the dual basis vector to $v$ in $\mathcal{B}_1^*$
and similarly for $w^*$. The function $f$ is bilinear and nontrivial since $f(v,w) = 1$.
Using the construction from the proof of Proposition 4.4.1, $\bar{f} : V \otimes W \to K$ satisfies
$\bar{f}(v \otimes w) = f(v,w) = v^*(v)w^*(w) = 1$. Hence, $v \otimes w$ is not the 0 element in
$V \otimes W$.


Proposition 4.4.7. Let $U$, $V$, and $W$ be three vector spaces over a field $K$. There
exists a unique isomorphism

\[
(U \otimes V) \otimes W \cong U \otimes (V \otimes W)
\]

such that

\[
(u \otimes v) \otimes w \mapsto u \otimes (v \otimes w)
\]

for all $u \in U$, $v \in V$, and $w \in W$.

Proof. By the identities (4.29),

\[
(u_1 \otimes v_1 + u_2 \otimes v_2) \otimes w = (u_1 \otimes v_1) \otimes w + (u_2 \otimes v_2) \otimes w,
\]

so every tensor in $(U \otimes V) \otimes W$ is the sum of tensors of the form $(u \otimes v) \otimes w$ with
$u \in U$, $v \in V$, and $w \in W$. Hence any linear transformation from $(U \otimes V) \otimes W$ is
uniquely determined by how it maps tensors of the form $(u \otimes v) \otimes w$.

Define the function $f : (U \otimes V) \times W \to U \otimes (V \otimes W)$ by

\[
f\left(\sum_{i=1}^{s} u_i \otimes v_i,\; w\right) = \sum_{i=1}^{s} u_i \otimes (v_i \otimes w).
\]

By distributivity properties of finite sums, it is easy to see that this is bilinear. By
Proposition 4.4.1, there exists a unique linear transformation $\bar{f} : (U \otimes V) \otimes W \to
U \otimes (V \otimes W)$ satisfying

\[
\bar{f}\left(\left(\sum_{i=1}^{s} u_i \otimes v_i\right) \otimes w\right) = \sum_{i=1}^{s} u_i \otimes (v_i \otimes w).
\]

Clearly, $\bar{f}((u \otimes v) \otimes w) = u \otimes (v \otimes w)$.

We can construct the inverse linear transformation $\bar{f}^{-1}$ in the same way. We
already showed uniqueness, but this establishes the existence of an isomorphism
satisfying the desired property.


In light of Proposition 4.4.7, the notation $U \otimes V \otimes W$ without parentheses is
uniquely defined. More generally, this allows us to consider the tensor product
$V_1 \otimes V_2 \otimes \cdots \otimes V_k$ of $k$ vector spaces $V_1, V_2, \ldots, V_k$ over the same field. In this
general context, we call a pure tensor in $V_1 \otimes V_2 \otimes \cdots \otimes V_k$ any element of the form
$v_1 \otimes v_2 \otimes \cdots \otimes v_k$. Again all elements of $V_1 \otimes V_2 \otimes \cdots \otimes V_k$ are finite sums of pure
tensors. We also denote by $V^{\otimes k}$ the $k$-fold tensor product of a vector space $V$ with
itself.


Proposition 4.4.8. Let $V$ and $W$ be two vector spaces over a field $K$. There exists
a unique isomorphism

\[
V \otimes W \cong W \otimes V
\]

such that $v \otimes w \mapsto w \otimes v$ for all $v \in V$ and $w \in W$.


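Proof. (Left as an exercise for the reader.)


Proposition 4.4.9. Let $V$ and $W$ be finite-dimensional vector spaces over a field
$K$. If $\mathcal{B}_1 = \{e_1, \ldots, e_m\}$ is a basis of $V$ and $\mathcal{B}_2 = \{f_1, \ldots, f_n\}$ is a basis of $W$, then

\[
\mathcal{B} = \{ e_i \otimes f_j \mid 1 \le i \le m \text{ and } 1 \le j \le n \}
\]

is a basis of $V \otimes W$ and therefore $\dim(V \otimes W) = (\dim V)(\dim W)$.


Proof. For every pure tensor $v \otimes w \in V \otimes W$, using the coordinates of $v$ and of $w$,
we have

\[
\begin{aligned}
v \otimes w &= (v^1 e_1 + \cdots + v^m e_m) \otimes w = v^1 (e_1 \otimes w) + \cdots + v^m (e_m \otimes w) \\
&= v^1 \big(e_1 \otimes (w^1 f_1 + \cdots + w^n f_n)\big) + \cdots + v^m \big(e_m \otimes (w^1 f_1 + \cdots + w^n f_n)\big) \\
&= v^i w^j\, e_i \otimes f_j \quad \text{(ESC)}.
\end{aligned}
\]

Thus, $\mathcal{B}$ spans $V \otimes W$.

Now suppose that (using ESC) $c^{ij} e_i \otimes f_j = 0$. Let $\mathcal{B}_2^* = \{f^{*1}, \ldots, f^{*n}\}$ be
the cobasis of $W^*$ associated to $\mathcal{B}_2$. For each $j$ with $1 \le j \le n$, the function
$\varphi_j : V \times W \to V$ defined by $\varphi_j(v,w) = f^{*j}(w)v$ is bilinear so uniquely defines a
linear transformation $\bar{\varphi}_j : V \otimes W \to V$ satisfying $\bar{\varphi}_j(v \otimes w) = f^{*j}(w)v$ on all pure
tensors. Since $c^{ij} e_i \otimes f_j = 0$, then for all $j_0$,

\[
0 = \bar{\varphi}_{j_0}(c^{ij} e_i \otimes f_j) = c^{ij} f^{*j_0}(f_j)\, e_i = c^{ij} \delta^{j_0}_j e_i = c^{i j_0} e_i
\]

as an element of $V$. Since $\mathcal{B}_1$ is a basis of $V$, then $c^{i j_0} = 0$ for all $i$, and this is for
all $j_0$. We conclude that $\mathcal{B}$ is linearly independent in $V \otimes W$. Hence, $\mathcal{B}$ is a basis of
$V \otimes W$ and so $\dim(V \otimes W) = mn$.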
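
The basis and the expansion $v \otimes w = v^i w^j\, e_i \otimes f_j$ can be illustrated in coordinates. In the sketch below (an added illustration), a tensor in $\mathbb{R}^m \otimes \mathbb{R}^n$ is modeled as a list of $mn$ numbers and $v \otimes w$ as the list of products $v^i w^j$; this coordinate model and the function names are assumptions of the example, not part of the abstract construction.

```python
def kron(v, w):
    """Coordinates of v (x) w, ordered (1,1), (1,2), ..., (m,n)."""
    return [a * b for a in v for b in w]

def std_basis(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

m, n = 2, 3
E, F = std_basis(m), std_basis(n)

# The mn pure tensors e_i (x) f_j.
B = [kron(e, f) for e in E for f in F]

# Expand v (x) w as the sum of v^i w^j e_i (x) f_j.
v, w = [2, -1], [1, 0, 3]
expansion = [0] * (m * n)
for i in range(m):
    for j in range(n):
        coeff = v[i] * w[j]
        expansion = [x + coeff * y
                     for x, y in zip(expansion, kron(E[i], F[j]))]

print(len(B), B == std_basis(m * n), expansion == kron(v, w))
```

In this model the vectors $e_i \otimes f_j$ are exactly the standard basis of $\mathbb{R}^{mn}$, which matches the dimension count $mn$.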


Because of Proposition 4.4.9, if $t \in V \otimes W$, it is common to use two indices to
index the components of $t$ with respect to the basis $\mathcal{B} = \{e_i \otimes f_j\}$. Saying that the
tensor $t$ has components $(c^{ij})$ with respect to $\mathcal{B}$ means that

\[
t = \sum_{i=1}^{m} \sum_{j=1}^{n} c^{ij}\, e_i \otimes f_j.
\]

We have used the superscript notation for the components of $t$ to be consistent with
the Einstein summation convention.

The next propositions illustrate how the tensor product construction directly 
generalizes the Hom space and the space of bilinear forms. 


Proposition 4.4.10. Let $V$ and $W$ be finite-dimensional vector spaces over a field
$K$. The space $V^* \otimes W$ is canonically isomorphic to $\operatorname{Hom}(V,W)$.


Proof. Consider the function $\varphi : V^* \times W \to \operatorname{Hom}(V,W)$ defined by $\varphi(\lambda, w) =
(v \mapsto \lambda(v)w)$. Since it is bilinear, by Proposition 4.4.1 there is a unique linear
transformation $\bar{\varphi} : V^* \otimes W \to \operatorname{Hom}(V,W)$ that maps the pure tensor $\lambda \otimes w$ to the
linear transformation $v \mapsto \lambda(v)w$.

The kernel of $\bar{\varphi}$ consists of all linear combinations $\lambda_1 \otimes w_1 + \cdots + \lambda_m \otimes w_m$ such
that the function in $\operatorname{Hom}(V,W)$ defined by

\[
\lambda_1(\cdot)w_1 + \cdots + \lambda_m(\cdot)w_m
\]

is identically 0. By the identities (4.29), we can assume that $\{w_1, \ldots, w_m\}$ is a
linearly independent set of vectors in $W$. Thus for each $v \in V$, we have $\lambda_i(v) = 0$
for all $1 \le i \le m$. Hence each $\lambda_i$ is the 0-map in $V^*$. From this we conclude that
$\operatorname{Ker} \bar{\varphi} = \{0\}$.

To show surjectivity, let $T \in \operatorname{Hom}(V,W)$. Let $\{v_1, \ldots, v_n\}$ be a basis of $V$, and
consider the linear functions $\{v_1^*, \ldots, v_n^*\}$ (see Equation (4.8) and the subsequent
explanation). Then the element

\[
\sum_{i=1}^{n} v_i^* \otimes T(v_i)
\]

maps to $T$ under $\bar{\varphi}$. Therefore, $\bar{\varphi}$ is also surjective.
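
The surjectivity step can be traced on a concrete matrix. The following sketch (an added illustration with hypothetical names) takes a linear map $T : \mathbb{R}^3 \to \mathbb{R}^2$, forms the maps $v \mapsto v_i^*(v)\,T(v_i)$ for the standard basis, where the dual basis covectors are assumed to have the same coordinates as the basis vectors, and checks that their sum acts like $T$.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def apply(T, v):
    """Matrix times vector, with T stored as a list of rows."""
    return [dot(row, v) for row in T]

def pure(lam, w):
    """Image of lam (x) w in Hom(V, W): the map v -> lam(v) w."""
    return lambda v: [dot(lam, v) * wi for wi in w]

T = [[1, 2, 0],
     [0, -1, 3]]
n = 3
basis = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# One term v_i^* (x) T(v_i) for each basis vector v_i.
terms = [pure(basis[i], apply(T, basis[i])) for i in range(n)]

def reconstructed(v):
    """Evaluate the sum of the pure-tensor maps at v."""
    out = [0] * len(T)
    for term in terms:
        out = [a + b for a, b in zip(out, term(v))]
    return out

samples = [[1, 0, 0], [2, -1, 5], [0, 3, 4]]
print(all(reconstructed(v) == apply(T, v) for v in samples))
```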


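Proposition 4.4.11. Let $V$ and $W$ be vector spaces over a field $K$. The set of
bilinear forms on $V \times W$ is a vector space with

\[
(\psi_1 + \psi_2)(v, w) = \psi_1(v, w) + \psi_2(v, w), \qquad (c\psi)(v, w) = c\,\psi(v, w)
\]

for all bilinear forms $\psi_1$, $\psi_2$, and $\psi$ and for all $c \in K$. Furthermore, the vector
space of bilinear forms on $V \times W$ is canonically isomorphic to $V^* \otimes W^*$.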


Proof. (The proof is left as an exercise for the reader.) 


PROBLEMS 


4.4.1. Let $V$ be a vector space over a field $K$. Let $v_1, v_2 \in V$. Show that in $V \otimes V$, we
have $v_1 \otimes v_2 = v_2 \otimes v_1$ if and only if $v_1$ and $v_2$ are collinear.


4.4.2. Let $V$ be a vector space over the field $K$. Prove that $V \otimes K$ is canonically
isomorphic to $V$.


4.4.3. Let $V$ and $W$ be finite dimensional vector spaces over a field $K$ with respective
bases $\mathcal{B} = \{e_1, \ldots, e_n\}$ and $\mathcal{B}' = \{f_1, \ldots, f_m\}$. Let $T : V \to W$ be a linear
transformation with matrix $A$ with respect to the bases $\mathcal{B}$ and $\mathcal{B}'$. $T$ determines
a linear transformation $T^{\otimes 2} : V \otimes V \to W \otimes W$ defined on pure tensors by

\[
T^{\otimes 2}(v_1 \otimes v_2) = T(v_1) \otimes T(v_2)
\]

and completed for other elements of $V \otimes V$ by linearity.

(a) If $V = W = \mathbb{R}^2$ and the matrix of a linear transformation $T$ with respect to
the standard basis is

\[
A = \begin{pmatrix} 2 & 3 \\ 5 & 1 \end{pmatrix},
\]

find the matrix of $T^{\otimes 2}$.

(b) In general, for any finite dimensional vector spaces $V$ and $W$ and linear
transformation $T$, if the coefficients of $A$ are $(a_{ij})$, find the coefficients of the
matrix for $T^{\otimes 2}$.


4.4.4. Let $V$ and $W$ be vector spaces over $\mathbb{C}$, and let $S : V \to V$ and $T : W \to W$ be linear
transformations. Consider the linear transformation $S \otimes T : V \otimes W \to V \otimes W$
defined on pure tensors by

\[
(S \otimes T)(v \otimes w) = S(v) \otimes T(w).
\]

(a) Suppose that $\dim V = 2$ and that $\dim W = 3$, with bases $\{e_1, e_2\}$ and $\{f_1, f_2, f_3\}$,
respectively. Suppose also that with respect to these bases, the matrices for


S and T are 
-1 0 2 
€ >) and 1 3 -2 
0 1 4 


Find the matrix for $S \otimes T$ with respect to the basis for $V \otimes W$ defined in
Proposition 4.4.9.


(b) Suppose that $S$ and $T$ are diagonalizable with eigenvalues $\lambda_1, \ldots, \lambda_m$ and
$\mu_1, \ldots, \mu_n$, respectively. Prove that $S \otimes T$ is diagonalizable and that the
eigenvalues of $S \otimes T$ are $\lambda_i \mu_j$ for $1 \le i \le m$ and $1 \le j \le n$.


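4.4.5. Let $V$ be a vector space over $\mathbb{C}$, and let $T : V \to V$ be a linear transformation.


(a) Suppose that the Jordan canonical form of $T$ is $J = \lambda I$. Find the Jordan
canonical form of $T^{\otimes 2}$.


(b) Suppose that the Jordan canonical form of $T$ is

\[
\begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0 & 0 \\
0 & \lambda & 1 & \cdots & 0 & 0 \\
0 & 0 & \lambda & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \lambda & 1 \\
0 & 0 & 0 & \cdots & 0 & \lambda
\end{pmatrix}.
\]

Find the Jordan canonical form of $T^{\otimes 2}$.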


4.4.6. Prove Proposition 4.4.8. 


4.4.7. Let $V$, $W_1$, and $W_2$ be finite dimensional vector spaces over a field $K$. Show that
there exists a canonical (independent of a given basis) isomorphism

\[
V \otimes (W_1 \oplus W_2) \cong (V \otimes W_1) \oplus (V \otimes W_2).
\]


4.4.8. Prove Proposition 4.4.11. [Hint: Call $\operatorname{Bil}(V, W)$ the set of bilinear forms on $V \times W$.
Show that $\Psi : V^* \otimes W^* \to \operatorname{Bil}(V, W)$ defined on pure tensors by $\Psi(\lambda \otimes \mu)(v, w) =
\lambda(v)\mu(w)$ gives the canonical isomorphism.]


4.4.9. Let $U$, $V$, and $W$ be vector spaces over a field $K$. Prove that $V^* \otimes W^* \otimes U$ is
canonically isomorphic to the vector space of bilinear transformations $V \times W \to U$.


4.4.10. In the identification $V^* \otimes W \cong \operatorname{Hom}(V, W)$ described in Proposition 4.4.10, show
that tensors of rank $r$ in $V^* \otimes W$ precisely correspond to linear transformations
in $\operatorname{Hom}(V, W)$ of rank $r$.


4.4.11. Consider the linear transformation $\operatorname{Tr} : V^* \otimes V \to K$ defined on pure tensors by
$\operatorname{Tr}(\lambda \otimes v) = \lambda(v)$. Under the isomorphism $\operatorname{Hom}(V,V) \cong V^* \otimes V$, show that $\operatorname{Tr}$
corresponds to the trace of a linear transformation.


4.5 Components of Tensors over V 


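Let $V$ be a vector space over a field $K$. Many applications of multilinear algebra,
in particular to differential geometry, involve tensor products in

\[
V^{\otimes r} \otimes V^{*\otimes s} = \underbrace{V \otimes V \otimes \cdots \otimes V}_{r \text{ times}} \otimes \underbrace{V^* \otimes V^* \otimes \cdots \otimes V^*}_{s \text{ times}}.
\]

For example, $\operatorname{Hom}(V,V)$ is $V \otimes V^*$ and the vector space of bilinear forms on $V$ is
$V^{*\otimes 2}$. We will see in Section 4.7 that the set (vector space) of all bilinear products
on $V$ is $V \otimes V^{*\otimes 2}$.

Definition 4.5.1. A tensor over $V$ of type $(r,s)$ is an element of $V^{\otimes r} \otimes V^{*\otimes s}$. A
scalar in $K$ is called a tensor of type $(0,0)$.

Suppose that $V$ has an ordered basis $\mathcal{B} = (e_1, e_2, \ldots, e_n)$ and that the associated
ordered cobasis is $\mathcal{B}^* = (e^{*1}, e^{*2}, \ldots, e^{*n})$. By Proposition 4.4.9 the basis of $V^{\otimes r} \otimes
V^{*\otimes s}$ associated to $\mathcal{B}$ consists of all the pure tensors

\[
e_{i_1} \otimes \cdots \otimes e_{i_r} \otimes e^{*j_1} \otimes \cdots \otimes e^{*j_s}
\]

for $i_k = 1, \ldots, n$ and $j_\ell = 1, \ldots, n$, for all $1 \le k \le r$ and $1 \le \ell \le s$. This basis confirms
that $\dim(V^{\otimes r} \otimes V^{*\otimes s}) = (\dim V)^{r+s}$. The components of a tensor $A \in V^{\otimes r} \otimes V^{*\otimes s}$
with respect to $\mathcal{B}$ are the $n^{r+s}$ values

\[
A^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s} \in K \tag{4.30}
\]

such that

\[
A = A^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}\, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_r} \otimes e^{*j_1} \otimes e^{*j_2} \otimes \cdots \otimes e^{*j_s} \quad \text{(ESC)}. \tag{4.31}
\]

Note that this formula involves summations over $r + s$ indices, all from 1 to $n$.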
Following the explanation at the end of Section 4.1, the superscript indices are 
called contravariant indices, while the subscript indices are called covariant indices. 
As in the contrast between a vector space and its dual, the difference between 
contravariant and covariant indices lies in how they affect the transformation of 
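components of a tensor under a change of basis on V.


4.5.1 Coordinate Changes


Proposition 4.5.2. Let $\mathcal{B}$ and $\mathcal{B}'$ be two bases on a finite dimensional vector space
$V$. Let $(a^i_j)$ be the components of the coordinate change matrix $A$ from $\mathcal{B}$ to $\mathcal{B}'$ and
let $(\check{a}^i_j)$ be the components of $A^{-1}$. Let $T^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}$ be the components of a tensor of
type $(r,s)$ with respect to $\mathcal{B}$, and let $\bar{T}^{k_1 k_2 \cdots k_r}_{l_1 l_2 \cdots l_s}$ be the components of the same tensor
$T$ with respect to $\mathcal{B}'$. Then

\[
\bar{T}^{k_1 k_2 \cdots k_r}_{l_1 l_2 \cdots l_s} = a^{k_1}_{i_1} a^{k_2}_{i_2} \cdots a^{k_r}_{i_r}\, \check{a}^{j_1}_{l_1} \check{a}^{j_2}_{l_2} \cdots \check{a}^{j_s}_{l_s}\, T^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}. \tag{4.32}
\]

Proof. Suppose that $\mathcal{B} = (e_1, e_2, \ldots, e_n)$ and $\mathcal{B}' = (f_1, f_2, \ldots, f_n)$. By definition of
the coordinate change matrix, $e_i = a^k_i f_k$ for all $i$ and by Proposition 4.1.6 $e^{*j} =
\check{a}^j_l f^{*l}$. Thus

\[
\begin{aligned}
&\bar{T}^{k_1 \cdots k_r}_{l_1 \cdots l_s}\, f_{k_1} \otimes \cdots \otimes f_{k_r} \otimes f^{*l_1} \otimes \cdots \otimes f^{*l_s} \\
&\quad = T^{i_1 \cdots i_r}_{j_1 \cdots j_s}\, e_{i_1} \otimes \cdots \otimes e_{i_r} \otimes e^{*j_1} \otimes \cdots \otimes e^{*j_s} \\
&\quad = T^{i_1 \cdots i_r}_{j_1 \cdots j_s}\, (a^{k_1}_{i_1} f_{k_1}) \otimes \cdots \otimes (a^{k_r}_{i_r} f_{k_r}) \otimes (\check{a}^{j_1}_{l_1} f^{*l_1}) \otimes \cdots \otimes (\check{a}^{j_s}_{l_s} f^{*l_s}) \\
&\quad = a^{k_1}_{i_1} \cdots a^{k_r}_{i_r}\, \check{a}^{j_1}_{l_1} \cdots \check{a}^{j_s}_{l_s}\, T^{i_1 \cdots i_r}_{j_1 \cdots j_s}\, f_{k_1} \otimes \cdots \otimes f_{k_r} \otimes f^{*l_1} \otimes \cdots \otimes f^{*l_s}.
\end{aligned}
\]

By identifying coordinates, the proposition follows.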
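
For a tensor of type $(1,1)$ the transformation rule reads $\bar{T}^k_l = a^k_i\, \check{a}^j_l\, T^i_j$, which in matrix language is conjugation by the coordinate change matrix. The sketch below (an added illustration; the sample matrices are arbitrary choices and `A_inv` is entered by hand as the inverse of `A`) evaluates the index sum directly and also checks that the pairing of a covector with a vector is unchanged.

```python
n = 2
A = [[2, 1], [1, 1]]          # coordinate change matrix, entries a^k_i
A_inv = [[1, -1], [-1, 2]]    # its inverse, entries of the inverse a^j_l
T = [[1, 2], [3, 4]]          # components T^i_j in the old basis

# Transformation law for a (1,1)-tensor, summing over i and j.
T_bar = [[sum(A[k][i] * A_inv[j][l] * T[i][j]
              for i in range(n) for j in range(n))
          for l in range(n)] for k in range(n)]

# A vector is of type (1,0) and a covector of type (0,1).
v = [1, 2]
lam = [3, -1]
v_bar = [sum(A[k][i] * v[i] for i in range(n)) for k in range(n)]
lam_bar = [sum(A_inv[j][l] * lam[j] for j in range(n)) for l in range(n)]

pairing_old = sum(lam[i] * v[i] for i in range(n))
pairing_new = sum(lam_bar[i] * v_bar[i] for i in range(n))
trace_old = sum(T[i][i] for i in range(n))
trace_new = sum(T_bar[i][i] for i in range(n))

print(T_bar, pairing_old == pairing_new, trace_old == trace_new)
```

The equality of the traces anticipates the contraction operation discussed later in this section.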


Physicists often introduce tensors by saying that a set of $n^{r+s}$ quantities
indexed as in (4.30) that change according to (4.32) under a basis change on $V$ "form
the components of a tensor." This perspective may be sufficient for various
calculations but it does not elucidate what a tensor over $V$ is.

We comment now on the linear algebraic meaning of a few common operations 
on tensors, when viewed from their components’ perspective. 

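If $A^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}$ form the components of an $(r,s)$-tensor $A$ and $B^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}$ form the
components of an $(r,s)$-tensor $B$, then the term-by-term addition

\[
C^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s} = A^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s} + B^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}
\]

also satisfies (4.32), so these quantities form the components of a tensor. This operation corresponds
to the usual addition of $A$ and $B$ as elements in the vector space $V^{\otimes r} \otimes V^{*\otimes s}$.
Similarly, given the components $A^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}$ of a tensor of type $(r,s)$, the operation of
multiplying all the components by a given scalar $c$ in the base field $K$ corresponds
to multiplying the tensor $A$ by the scalar $c$, again as an operation in the vector space
$V^{\otimes r} \otimes V^{*\otimes s}$.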

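It is not hard to check that if $A^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}$ and $B^{k_1 k_2 \cdots k_t}_{l_1 l_2 \cdots l_u}$ are components of tensors
of type $(r,s)$ and $(t,u)$, respectively, then the quantities obtained by multiplying
these components

\[
W^{i_1 i_2 \cdots i_r k_1 k_2 \cdots k_t}_{j_1 j_2 \cdots j_s l_1 l_2 \cdots l_u} = A^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}\, B^{k_1 k_2 \cdots k_t}_{l_1 l_2 \cdots l_u}
\]

form the components of another tensor but of type $(r+t, s+u)$. This operation,
called tensor multiplication or the product of two tensors, corresponds to the bilinear
transformation

\[
\left(V^{\otimes r} \otimes V^{*\otimes s}\right) \times \left(V^{\otimes t} \otimes V^{*\otimes u}\right) \to V^{\otimes (r+t)} \otimes V^{*\otimes (s+u)}
\]

defined by $(\alpha, \beta) \mapsto \alpha \otimes \beta$. Therefore, this tensor multiplication utilizes the
isomorphism

\[
\left(V^{\otimes r} \otimes V^{*\otimes s}\right) \otimes \left(V^{\otimes t} \otimes V^{*\otimes u}\right) \cong V^{\otimes (r+t)} \otimes V^{*\otimes (s+u)}.
\]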

Finally, the contraction operation on the components of a tensor

\[
A^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s} \longmapsto B^{i_1 i_2 \cdots i_{r-1}}_{j_1 j_2 \cdots j_{s-1}} = A^{i_1 i_2 \cdots i_{r-1} k}_{j_1 j_2 \cdots j_{s-1} k}
\]

corresponds to setting one contravariant and one covariant index to be the same
and then summing over that index. (The contraction operation does not have to
occur on the last indices as in the above equation.) On the indices involved, this
corresponds to the linear transformation $C : V \otimes V^* \to K$ defined on a pure tensor
by $v \otimes \lambda \mapsto \lambda(v)$. Exercise 4.4.11 showed that the contraction operation is similar
to the operation of taking the trace of a matrix along certain specified indices.
If $v \in V$ is a vector and $A \in V^{\otimes r} \otimes V^{*\otimes s}$ is a tensor of type $(r,s)$ with $s \ge 1$,
then some writers use the symbol

\[
v \,\lrcorner\, A
\]

to indicate the $(r, s-1)$ tensor that corresponds to the contraction along the index
of $v$ and the first covariant index of $A$.
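
On components, contraction is just a sum over a repeated index. The sketch below (an added illustration with arbitrary sample values) contracts a type $(2,1)$ array $A^{ij}_k$ over its last upper and lower indices, contracts a pure tensor $v \otimes \lambda$ to recover $\lambda(v)$, and contracts a vector with the first index of a $(0,2)$ array. The storage convention, nested lists indexed in the order written, is an assumption of the example.

```python
n = 2

# A[i][j][k] holds the component A^{ij}_k of a (2,1)-tensor.
A = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]
# Contract the second upper index with the lower index: B^i = A^{ik}_k.
B = [sum(A[i][k][k] for k in range(n)) for i in range(n)]

# Components of the pure tensor v (x) lam are v^i lam_j.
v = [1, -1]
lam = [3, 5]
pure = [[v[i] * lam[j] for j in range(n)] for i in range(n)]
contracted = sum(pure[i][i] for i in range(n))      # equals lam(v)

# Contract v with the first covariant index of a (0,2)-tensor C_{ij}.
C = [[1, 2], [3, 4]]
v_into_C = [sum(v[i] * C[i][j] for i in range(n)) for j in range(n)]

print(B, contracted, v_into_C)
```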


4.5.2 Examples 


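Example 4.5.3 (Cross Product). Consider $V = \mathbb{R}^3$. The cross product between
two vectors is a bilinear transformation $\times : V \times V \to V$, so is a linear transformation
$V \otimes V \to V$. In this way of considering it, the cross product is a particular element
of $V^* \otimes V^* \otimes V$. We can describe it through its components expressed in reference to
the standard ordered basis $\mathcal{E} = (\vec{\imath}, \vec{\jmath}, \vec{k})$. We write its components as $C^k_{ij}$ satisfying

\[
\begin{aligned}
C^3_{12} &= 1, & C^1_{23} &= 1, & C^2_{31} &= 1, \\
C^3_{21} &= -1, & C^1_{32} &= -1, & C^2_{13} &= -1,
\end{aligned}
\]

and all other components are 0. Suppose that $\mathcal{B} = (u_1, u_2, u_3)$ is some other ordered
basis with the change of coordinate matrix $P = P^{\mathcal{B}}_{\mathcal{E}}$ with components $(p^i_j)$, and write
$(\check{p}^i_j)$ for the components of $P^{-1}$. Then the
cross product expressed with respect to $\mathcal{B}$ has the components

\[
\bar{C}^t_{rs} = p^t_k\, \check{p}^i_r\, \check{p}^j_s\, C^k_{ij}.
\]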

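Example 4.5.4 (Inverse of a (0,2)-Tensor). As a more involved example, consider
the components $C_{ij}$ of a $(0,2)$-tensor over $V$ with respect to some basis $\mathcal{B}$. Recall
that a $(0,2)$-tensor represents a bilinear form $\langle\cdot,\cdot\rangle$ on $V$. Suppose in addition that
$\langle\cdot,\cdot\rangle$ is nondegenerate. This is equivalent to the fact that if the $C_{ij}$ are organized into
an $n \times n$ matrix, then this matrix is invertible. Denote by $C^{ij}$ the coefficients of the
inverse matrix of $(C_{ij})$. We prove that $C^{ij}$ form the components of a $(2,0)$-tensor.

Let $P = (p^i_j)$ be a coordinate change matrix from $\mathcal{B}$-coordinates to some other
system of coordinates, and let $(\check{p}^i_j)$ be the components of $P^{-1}$. If $(\bar{C}_{rs})$ are the components of the same object with respect
to the other basis, then

\[
C^{ik} C_{kj} = \delta^i_j \qquad \text{and} \qquad \bar{C}^{rs} \bar{C}_{st} = \delta^r_t. \tag{4.33}
\]

Equation (4.33) gives $\bar{C}^{rs}\, \check{p}^i_s\, \check{p}^j_t\, C_{ij} = \delta^r_t$. Multiplying both sides by $p^t_\alpha$ and summing
over $t$, we obtain

\[
\bar{C}^{rs}\, \check{p}^i_s\, \check{p}^j_t\, p^t_\alpha\, C_{ij} = \bar{C}^{rs}\, \check{p}^i_s\, \delta^j_\alpha\, C_{ij} = \bar{C}^{rs}\, \check{p}^i_s\, C_{i\alpha} = \delta^r_t\, p^t_\alpha = p^r_\alpha.
\]

Multiplying both sides by $C^{\alpha\beta}$ and then summing over $\alpha$, we get

\[
\bar{C}^{rs}\, \check{p}^i_s\, C_{i\alpha} C^{\alpha\beta} = \bar{C}^{rs}\, \check{p}^i_s\, \delta^\beta_i = \bar{C}^{rs}\, \check{p}^\beta_s = p^r_\alpha\, C^{\alpha\beta}.
\]

Finally, multiplying the rightmost equality by $p^t_\beta$ and summing over $\beta$, we conclude
that

\[
\bar{C}^{rt} = p^r_\alpha\, p^t_\beta\, C^{\alpha\beta}.
\]

This shows that the quantities $C^{ij}$ satisfy Proposition 4.5.2 and hence form the
components of a $(2,0)$-tensor.

We should ask ourselves whether we can understand this tensor in a coordinate-independent
way. In fact, we already presented this object in Exercise 4.2.5. The
components $C^{ij}$ represent the bilinear form $\langle\cdot,\cdot\rangle^*$ on $V^*$ defined in that exercise.
Using notations from there, we see that the components of $\lambda_u$ are $C_{ij} u^i$ and the
components of $\lambda_v$ are $C_{kl} v^k$. Then

\[
C_{ij} u^i\, C^{jl}\, C_{kl} v^k = \delta^l_i\, u^i\, C_{kl} v^k = u^l\, C_{kl}\, v^k.
\]

This last expression is $\langle v, u\rangle$, which confirms that, in the notation of Exercise 4.2.5,
the $C^{ij}$ are the components of $\langle\cdot,\cdot\rangle^*$.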


4.5.3. Numerical Tensors 


Definition 4.5.5. A numerical tensor is a tensor that is not a scalar whose com- 
ponents are the same given with respect to any basis. 


As a first example, consider the Kronecker delta

\[
\delta^i_j = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \neq j. \end{cases}
\]

Under a coordinate change with matrix $(p^i_j)$, whose inverse has components $(\check{p}^i_j)$, it transforms according to

\[
\bar{\delta}^r_s = p^r_i\, \check{p}^j_s\, \delta^i_j = p^r_i\, \check{p}^i_s.
\]

Since this last expression represents the product of a matrix with its inverse,
we see that again $\bar{\delta}^r_s$ is 1 if $r = s$ and 0 otherwise. This should make sense because
$\delta^i_j$ represents the identity function on a vector space, and with respect to any basis
the components of the identity transformation form the identity matrix. Therefore, $\delta^i_j$
is a $(1,1)$-tensor in a tautological way.

The generalized Kronecker delta of order $r$ is a tensor of type $(r,r)$, with
components denoted by $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r}$, defined as the following determinant:

\[
\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r} =
\begin{vmatrix}
\delta^{i_1}_{j_1} & \delta^{i_1}_{j_2} & \cdots & \delta^{i_1}_{j_r} \\
\delta^{i_2}_{j_1} & \delta^{i_2}_{j_2} & \cdots & \delta^{i_2}_{j_r} \\
\vdots & \vdots & \ddots & \vdots \\
\delta^{i_r}_{j_1} & \delta^{i_r}_{j_2} & \cdots & \delta^{i_r}_{j_r}
\end{vmatrix}. \tag{4.34}
\]

For example, the components of the generalized Kronecker delta of order 2 are

\[
\delta^{ij}_{kl} = \delta^i_k \delta^j_l - \delta^i_l \delta^j_k,
\]

which presents $\delta^{ij}_{kl}$ as the difference between two $(2,2)$-tensors, which shows that $\delta^{ij}_{kl}$
is indeed a tensor. More generally, expanding out Equation (4.34) by the Laplace
expansion of a determinant gives the generalized Kronecker delta of order $r$ as a sum
of $r!$ components of tensors of type $(r,r)$, proving that the $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r}$ are the components
of an $(r,r)$-tensor.

Properties of the determinant imply that $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r}$ is antisymmetric in the
superscript indices and also antisymmetric in the subscript indices. Equivalently,
$\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r} = 0$ if any of the superscript indices are equal or if any of the subscript
indices are equal, and the value of a component is negated if any two superscript
indices are interchanged, and similarly for subscript indices. We also note that if
$r > n$, where we assume $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r}$ are the components of a tensor over an $n$-dimensional
vector space, then $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r} = 0$ for all choices of indices since at least two superscript
(and at least two subscript) indices would be equal.
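
These properties are easy to test by brute force. The sketch below (an added illustration with hypothetical function names) computes the generalized Kronecker delta as the determinant of the matrix of ordinary Kronecker deltas, using the permutation expansion of the determinant, and then checks the order 2 formula, the antisymmetry in the upper indices, and the vanishing for $r > n$.

```python
from itertools import permutations, product
from math import prod

def sign(p):
    """Sign of a permutation p of range(len(p)), by counting inversions."""
    inv = sum(1 for a in range(len(p)) for b in range(a + 1, len(p))
              if p[a] > p[b])
    return -1 if inv % 2 else 1

def kd(i, j):
    return 1 if i == j else 0

def gen_delta(upper, lower):
    """Determinant of the r x r matrix with (a, b) entry kd(upper[a], lower[b])."""
    r = len(upper)
    return sum(sign(p) * prod(kd(upper[a], lower[p[a]]) for a in range(r))
               for p in permutations(range(r)))

n = 3
idx4 = list(product(range(n), repeat=4))
order2_ok = all(gen_delta((i, j), (k, l))
                == kd(i, k) * kd(j, l) - kd(i, l) * kd(j, k)
                for i, j, k, l in idx4)
antisym_ok = all(gen_delta((i, j), (k, l)) == -gen_delta((j, i), (k, l))
                 for i, j, k, l in idx4)

# Order r = 3 over a 2-dimensional space: every component vanishes.
vanishes = all(gen_delta(I, J) == 0
               for I in product(range(2), repeat=3)
               for J in product(range(2), repeat=3))

print(order2_ok, antisym_ok, vanishes)
```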

We introduce one more symbol that is commonly used in calculations with tensor
components, the permutation symbol. Define

\[
\varepsilon^{i_1 \cdots i_n} = \delta^{i_1 \cdots i_n}_{1 \cdots n}, \qquad
\varepsilon_{j_1 \cdots j_n} = \delta^{1 \cdots n}_{j_1 \cdots j_n}. \tag{4.35}
\]

Note that the maximal index $n$ in Equation (4.35) as opposed to $r$ is intentional.
Recall that a permutation of $\{1,2,\ldots,n\}$ is a bijection on that set and a
transposition is a permutation that interchanges two elements and leaves the rest fixed.
A fact in modern algebra (see [25, Theorem 5.5]) states that given a permutation $\sigma$
on $\{1,2,\ldots,n\}$, if we have two ways to write $\sigma$ as a composition of transpositions,
e.g.,

\[
\sigma = \tau_1 \circ \tau_2 \circ \cdots \circ \tau_a = \tau'_1 \circ \tau'_2 \circ \cdots \circ \tau'_b,
\]

then $a$ and $b$ have the same parity.


Definition 4.5.6. We call a permutation even (respectively odd) if this common
parity is even (respectively odd) and the sign of $\sigma$ is

\[
\operatorname{sign}(\sigma) = \begin{cases} 1, & \text{if } \sigma \text{ is even}, \\ -1, & \text{if } \sigma \text{ is odd}. \end{cases}
\]


Because of the properties of the determinant, it is not hard to see that

\[
\varepsilon^{i_1 \cdots i_n} = \varepsilon_{i_1 \cdots i_n} =
\begin{cases}
1, & \text{if } (i_1, \ldots, i_n) \text{ is an even permutation of } (1,2,\ldots,n), \\
-1, & \text{if } (i_1, \ldots, i_n) \text{ is an odd permutation of } (1,2,\ldots,n), \\
0, & \text{if } (i_1, \ldots, i_n) \text{ is not a permutation of } (1,2,\ldots,n).
\end{cases}
\]


The permutation symbol is an example for which, despite the apparently proper
notation, the collection of quantities is not a numerical tensor. Instead, we have
the following proposition.


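Proposition 4.5.7. Let $\mathcal{B}$ and $\mathcal{B}'$ be two bases on a finite dimensional vector space
$V$. Let $A = (a^i_j)$ be the components of the coordinate change matrix from $\mathcal{B}$ to $\mathcal{B}'$
coordinates, and let $(\check{a}^i_j)$ be the components of $A^{-1}$. The permutation symbols transform according to

\[
\det(A)\, \bar{\varepsilon}^{\,j_1 \cdots j_n} = a^{j_1}_{i_1} a^{j_2}_{i_2} \cdots a^{j_n}_{i_n}\, \varepsilon^{i_1 \cdots i_n},
\]

\[
(\det(A))^{-1}\, \bar{\varepsilon}_{k_1 \cdots k_n} = \check{a}^{h_1}_{k_1} \check{a}^{h_2}_{k_2} \cdots \check{a}^{h_n}_{k_n}\, \varepsilon_{h_1 \cdots h_n}.
\]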


Proof. (Left as an exercise for the reader.) 


The generalized Kronecker delta has a close connection to determinants, which
we will elucidate here. Note that if the superscript indices are distinct and exactly
equal to the subscript indices, then $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r}$ is the
determinant of the identity matrix, whereas it vanishes if an index is repeated. Thus, the
contraction over all indices, $\delta^{i_1 \cdots i_r}_{i_1 \cdots i_r}$, counts the number
of permutations of $r$ indices taken from the set $\{1,2,\ldots,n\}$. Thus,

\[
\delta^{i_1 \cdots i_r}_{i_1 \cdots i_r} = \frac{n!}{(n-r)!}. \tag{4.36}
\]
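
Equation (4.36) can be checked by brute force for small $n$ and $r$. The sketch below
assumes the usual convention that the generalized Kronecker delta equals the sign of the
permutation taking the subscript indices to the superscript indices when both lists have
the same distinct entries, and 0 otherwise. The helper names `perm_sign`, `gen_delta`,
and `full_contraction` are illustrative choices, not notation from the text.

```python
from itertools import product
from math import factorial


def perm_sign(p):
    """Sign of a permutation given as a tuple of distinct integers (inversion count)."""
    inversions = sum(
        1
        for a in range(len(p))
        for b in range(a + 1, len(p))
        if p[a] > p[b]
    )
    return -1 if inversions % 2 else 1


def gen_delta(upper, lower):
    """Generalized Kronecker delta with superscripts `upper` and subscripts `lower`.

    Assumed convention: +1 or -1 when `upper` is an even or odd rearrangement of
    `lower` (all entries distinct), and 0 in every other case.
    """
    if len(set(lower)) < len(lower) or sorted(upper) != sorted(lower):
        return 0
    positions = tuple(lower.index(u) for u in upper)
    return perm_sign(positions)


def full_contraction(n, r):
    """Sum of the delta with equal upper and lower indices, each index running over 1..n."""
    return sum(gen_delta(idx, idx) for idx in product(range(1, n + 1), repeat=r))


if __name__ == "__main__":
    for n, r in [(3, 2), (4, 2), (4, 3)]:
        print(n, r, full_contraction(n, r), factorial(n) // factorial(n - r))
```

The two printed columns agree for each pair, since only tuples with distinct entries
contribute, and each contributes 1.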


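Another property of the generalized Kronecker delta is that

\[
\varepsilon^{j_1 \cdots j_n}\, \varepsilon_{i_1 \cdots i_n} = \delta^{j_1 \cdots j_n}_{i_1 \cdots i_n},
\]

the proof of which is left as an exercise for the reader (Problem 4.5.5). Now let $a^i_j$
be the components of a (1,1)-tensor, which we can view as the matrix of a linear
transformation from $\mathbb{R}^n$ to $\mathbb{R}^n$. By definition of the determinant,

\[
\det(a^i_j) = \varepsilon^{j_1 \cdots j_n}\, a^1_{j_1} \cdots a^n_{j_n}.
\]

Then, by properties of the determinant related to rearranging rows or columns, we have

\[
\varepsilon^{i_1 \cdots i_n} \det(a^i_j) = \varepsilon^{j_1 \cdots j_n}\, a^{i_1}_{j_1} \cdots a^{i_n}_{j_n}.
\]

Multiplying by $\varepsilon_{i_1 \cdots i_n}$ and summing over all the indices
$i_1,\ldots,i_n$, we have

\[
\varepsilon_{i_1 \cdots i_n}\, \varepsilon^{i_1 \cdots i_n} \det(a^i_j)
  = \delta^{j_1 \cdots j_n}_{i_1 \cdots i_n}\, a^{i_1}_{j_1} \cdots a^{i_n}_{j_n},
\]

and since $\varepsilon_{i_1 \cdots i_n}\, \varepsilon^{i_1 \cdots i_n}$ counts the number of
permutations of $\{1,\ldots,n\}$, we have

\[
n!\, \det(a^i_j) = \delta^{j_1 \cdots j_n}_{i_1 \cdots i_n}\, a^{i_1}_{j_1} \cdots a^{i_n}_{j_n}. \tag{4.37}
\]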


4.5.4 Tensor Fields 


Later in this book, tensor fields on manifolds will play a key role in describing
structures of interest on manifolds. Before facing that full generality, we briefly
consider tensor fields on $\mathbb{R}^n$ from the component perspective.

Let $U$ be an open region of $\mathbb{R}^n$ equipped with two coordinate systems
$(x^1, x^2, \ldots, x^n)$ and $(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n)$, and let $p \in U$.
For this section, we think of a tensor field over $U$ as expressed by a collection of
components $T^{i_1 \cdots i_r}_{j_1 \cdots j_s}$, where each of these is a function
$U \to \mathbb{R}$. At a given point $p \in U$, these are tensors over the vector space
$T_p\mathbb{R}^n$. The ordered basis associated to the $(x^1, x^2, \ldots, x^n)$ coordinates is
$(\partial/\partial x^1, \ldots, \partial/\partial x^n)$, while the ordered basis associated
to the $(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n)$ coordinates is
$(\partial/\partial \bar{x}^1, \ldots, \partial/\partial \bar{x}^n)$. The change of coordinate
matrix between these is

\[
\left( \frac{\partial \bar{x}^i}{\partial x^j} \bigg|_p \right).
\]
Example 4.5.8 (Gradient). Let $f : U \to \mathbb{R}$ be a differentiable function and
consider the gradient $\nabla f_p$. It has components $\partial f/\partial x^i$, evaluated
at $p$. This is a tensor of type (0,1) because in the
$(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n)$ coordinates, its components are

\[
\frac{\partial f}{\partial \bar{x}^j}
  = \frac{\partial f}{\partial x^i} \frac{\partial x^i}{\partial \bar{x}^j}
  = \frac{\partial x^i}{\partial \bar{x}^j} \frac{\partial f}{\partial x^i}
\]

by the chain rule. This satisfies (4.32) for a (0,1)-tensor. Consequently, in the expression
$\partial f/\partial x^i$, though the $i$ appears as a superscript index of the variable, we
understand it as a covariant index instead of a contravariant index because it appears in the
“denominator” of a partial derivative.
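
As a numerical illustration of the covariant transformation in this example, the sketch
below compares the chain-rule formula with direct finite differences. The function
$f(x,y) = x^2 y$, the choice of polar coordinates as the barred system, and all function
names are assumptions made only for the illustration.

```python
import math


def f(x, y):
    """Sample scalar function (an arbitrary choice for the illustration)."""
    return x * x * y


def grad_f(x, y):
    """Components (df/dx, df/dy) in the Cartesian coordinates."""
    return (2 * x * y, x * x)


def to_cartesian(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian(r, theta):
    """Matrix of d x^i / d xbar^j: rows are (x, y), columns are (r, theta)."""
    return (
        (math.cos(theta), -r * math.sin(theta)),
        (math.sin(theta), r * math.cos(theta)),
    )


def grad_polar_by_chain_rule(r, theta):
    """Covariant rule: df/dxbar^j = (df/dx^i) (dx^i/dxbar^j), summed over i."""
    x, y = to_cartesian(r, theta)
    g = grad_f(x, y)
    jac = jacobian(r, theta)
    return tuple(sum(g[i] * jac[i][j] for i in range(2)) for j in range(2))


def grad_polar_by_differences(r, theta, h=1e-6):
    """Central differences of f expressed directly in polar coordinates."""
    def f_polar(r_, t_):
        return f(*to_cartesian(r_, t_))

    return (
        (f_polar(r + h, theta) - f_polar(r - h, theta)) / (2 * h),
        (f_polar(r, theta + h) - f_polar(r, theta - h)) / (2 * h),
    )


if __name__ == "__main__":
    print(grad_polar_by_chain_rule(2.0, 0.7))
    print(grad_polar_by_differences(2.0, 0.7))
```

The two printed pairs agree up to the finite-difference error, which is what the
(0,1)-tensor transformation rule predicts.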


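The following example illustrates some of the subtlety required when working
with tensor fields in component form.

Example 4.5.9. Let $B_i$ be the components of a covariant vector field. We prove
that the collection of functions

\[
\frac{\partial B_i}{\partial x^j} - \frac{\partial B_j}{\partial x^i}
\]

form the components of a (0,2)-tensor. In the $(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n)$
coordinates, we have

\begin{align*}
\frac{\partial \bar{B}_k}{\partial \bar{x}^l} - \frac{\partial \bar{B}_l}{\partial \bar{x}^k}
 &= \frac{\partial}{\partial \bar{x}^l}\left( \frac{\partial x^i}{\partial \bar{x}^k} B_i \right)
  - \frac{\partial}{\partial \bar{x}^k}\left( \frac{\partial x^j}{\partial \bar{x}^l} B_j \right) \\
 &= \frac{\partial^2 x^i}{\partial \bar{x}^l \partial \bar{x}^k} B_i
  + \frac{\partial x^i}{\partial \bar{x}^k} \frac{\partial B_i}{\partial \bar{x}^l}
  - \frac{\partial^2 x^j}{\partial \bar{x}^k \partial \bar{x}^l} B_j
  - \frac{\partial x^j}{\partial \bar{x}^l} \frac{\partial B_j}{\partial \bar{x}^k}. \tag{4.38}
\end{align*}

Because we sum over variables repeated in the superscript and the subscript, the first
and third terms cancel out. So applying the chain rule on
$\partial B_i/\partial \bar{x}^l$ and similarly in the fourth term,

\begin{align*}
\frac{\partial \bar{B}_k}{\partial \bar{x}^l} - \frac{\partial \bar{B}_l}{\partial \bar{x}^k}
 &= \frac{\partial x^i}{\partial \bar{x}^k} \frac{\partial B_i}{\partial x^u} \frac{\partial x^u}{\partial \bar{x}^l}
  - \frac{\partial x^j}{\partial \bar{x}^l} \frac{\partial B_j}{\partial x^v} \frac{\partial x^v}{\partial \bar{x}^k} \\
 &= \frac{\partial x^i}{\partial \bar{x}^k} \frac{\partial x^j}{\partial \bar{x}^l} \frac{\partial B_i}{\partial x^j}
  - \frac{\partial x^j}{\partial \bar{x}^l} \frac{\partial x^i}{\partial \bar{x}^k} \frac{\partial B_j}{\partial x^i} \\
 &= \frac{\partial x^i}{\partial \bar{x}^k} \frac{\partial x^j}{\partial \bar{x}^l}
    \left( \frac{\partial B_i}{\partial x^j} - \frac{\partial B_j}{\partial x^i} \right),
\end{align*}

by setting $u = j$ and $v = i$.

We should also observe that the component functions $\partial B_i/\partial x^j$ do not
describe a tensor field of type (0,2) because of the mixed second partial derivative that
appears in (4.38).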

PROBLEMS 

4.5.1. Prove that (a) $\delta^i_j \delta^j_k \delta^k_l = \delta^i_l$; (b) $\delta^i_j \delta^j_k \delta^k_i = n$.

4.5.2. Let $T^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}$ be a tensor of type $(r,s)$. Prove that
the quantities $T^{i\, i_2 \cdots i_r}_{i\, j_2 \cdots j_s}$, obtained by contracting over the
first two indices, form the components of a tensor of type $(r-1, s-1)$. Explain in a
coordinate-free way why we still obtain a tensor when we contract over any superscript and
subscript index.

4.5.3. Let $S_{ijk}$ be the components of a tensor, and suppose they are antisymmetric in
$\{i,j\}$. Find a tensor with components $T_{ijk}$ that is antisymmetric in $j,k$ satisfying

\[ -T_{ijk} + T_{jik} = S_{ijk}. \]
4.5.4. Prove Proposition 4.5.7.

4.5.5. Prove that $\varepsilon^{j_1 \cdots j_n}\, \varepsilon_{i_1 \cdots i_n} = \delta^{j_1 \cdots j_n}_{i_1 \cdots i_n}$.

4.5.6. Let $A_{ij}$ be the components of an antisymmetric (i.e., $A_{ji} = -A_{ij}$) tensor
field of type (0,2), and define the quantities

\[
B_{rst} = \frac{\partial A_{st}}{\partial x^r} + \frac{\partial A_{tr}}{\partial x^s} + \frac{\partial A_{rs}}{\partial x^t}.
\]

(a) Prove that $B_{rst}$ are the components of a tensor of type (0,3).

(b) Prove that the components $B_{rst}$ are antisymmetric in all their indices.

(c) Determine the number of independent components of antisymmetric tensors
of type (0,3) over $\mathbb{R}^n$.

(d) Would the quantities $B_{rst}$ still be the components of a tensor if $A_{ij}$ were
symmetric?

4.5.7. Let $a^i_j$ be the components of a (1,1)-tensor, or in other words the matrix of a
linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^n$ given with respect to some basis.
Recall that the characteristic equation for the matrix is

\[ \det(a^i_j - \lambda \delta^i_j) = 0. \tag{4.39} \]

Prove that Equation (4.39) is equivalent to

\[ \lambda^n + \sum_{r=1}^{n} (-1)^r a_{(r)} \lambda^{n-r} = 0, \]

where

\[
a_{(r)} = \frac{1}{r!}\, \delta^{j_1 \cdots j_r}_{i_1 \cdots i_r}\, a^{i_1}_{j_1} \cdots a^{i_r}_{j_r}.
\]

[Hint: The solutions to Equation (4.39) are the eigenvalues of the matrix $(a^i_j)$.]

4.5.8.
Moment of Inertia Tensor. Suppose that $\mathbb{R}^3$ is given a basis that is not
necessarily the standard one. Let $g_{ij}$ be the components of the standard inner product
corresponding to this basis, which means that the scalar product between two
(contravariant) vectors $A^i$ and $B^j$ is given by

\[ \langle A, B \rangle = g_{ij} A^i B^j. \]

In the rest of the problem, call $(x^1, x^2, x^3)$ the coordinates of the position vector
$\vec{r}$. Let $S$ be a solid in space with a density function $\rho(\vec{r})$, and suppose
that it rotates about an axis $\ell$ through the origin. The angular velocity vector
$\vec{\omega}$ is defined as the vector along the axis $\ell$, pointing in the direction that
makes the rotation a right-hand corkscrew motion, with magnitude
$\omega = \|\vec{\omega}\|$ equal to the radians per second swept out by the motion of
rotation. Let $(\omega^1, \omega^2, \omega^3)$ be the components of $\vec{\omega}$ in the given
basis. The moment of inertia of the solid $S$ about the direction $\vec{\omega}$ is defined as
the quantity

\[ I_\ell = \iiint_S \rho(\vec{r})\, r_\perp^2 \, dV, \]

where $r_\perp$ is the distance from a point $\vec{r}$ with coordinates $(x^1, x^2, x^3)$ to
the axis $\ell$.

The moment of inertia tensor of a solid is often presented using cross products, but
we define it here using a characterization that is equivalent to the usual definition
but avoids cross products. We define the moment of inertia tensor as the unique
(0,2)-tensor with components $I_{ij}$ such that

\[ \frac{1}{2} I_{ij}\, \omega^i \omega^j = \frac{1}{2} I_\ell\, \omega^2. \tag{4.40} \]

Note that this is the kinetic energy of the rotating object.

(a) Prove that

\[
r_\perp^2 = g_{ij} x^i x^j - \frac{(g_{kl}\, \omega^k x^l)^2}{g_{rs}\, \omega^r \omega^s}.
\]

(b) Prove that, using the metric $g_{ij}$, the moment of inertia tensor is given by

\[
I_{ij} = \iiint_S \rho(x^1, x^2, x^3)\, (g_{ij} g_{kl} - g_{ik} g_{jl})\, x^k x^l \, dV. \tag{4.41}
\]

(c) Show that

\[
(g_{ij} g_{kl} - g_{ik} g_{jl})\, x^k x^l = g_{ip}\, g_{lq}\, \delta^{pq}_{jk}\, x^k x^l,
\]

where $\delta^{pq}_{jk}$ is the generalized Kronecker delta of order 2.

(d) Prove that $I_{ij} = I_{ji}$ for all $1 \le i, j \le n$.

(e) Prove that if the basis of $\mathbb{R}^3$ is orthonormal (which means that $(g_{ij})$ is
the identity matrix), we recover the following familiar formulas:

\begin{align*}
I_{11} &= \iiint_S \rho\, \big( (x^2)^2 + (x^3)^2 \big)\, dV, & I_{12} &= -\iiint_S \rho\, x^1 x^2 \, dV, \tag{4.42}\\
I_{22} &= \iiint_S \rho\, \big( (x^1)^2 + (x^3)^2 \big)\, dV, & I_{13} &= -\iiint_S \rho\, x^1 x^3 \, dV, \tag{4.43}\\
I_{33} &= \iiint_S \rho\, \big( (x^1)^2 + (x^2)^2 \big)\, dV, & I_{23} &= -\iiint_S \rho\, x^2 x^3 \, dV. \tag{4.44}
\end{align*}

(We took the relation in (4.40) as the defining property of the moment of inertia
tensor because of the theorem that $I_\ell\, \omega$ is the component, along the axis of
rotation, of the angular momentum vector, which is given by $(I_{ij}\, \omega^j)$. See [22]
p. 221-222 and, in particular, Equation (9.7) for an explanation.

The interesting point about this approach is that it avoids the use of an orthonormal
basis and provides a formula for the moment of inertia tensor when one has an affine
metric tensor that is not the identity.)


4.6 Symmetric and Alternating Products


In the tensor product $V \otimes V$, in general $v_1 \otimes v_2 \neq v_2 \otimes v_1$. It is
sometimes useful to have a tensor-like product that is either commutative or
anticommutative.

For example, we have seen that every bilinear form on $V$ is an element of
$V^* \otimes V^*$. However, in geometry, many useful applications involve symmetric bilinear
forms. If $A_{ij}$ are the components of an element $A \in V^* \otimes V^*$ with respect to a
given basis on $V$, then the condition that $A$ be a symmetric bilinear form means that
$A_{ij} = A_{ji}$ for all $1 \le i, j \le \dim V$. The set of symmetric bilinear forms is a
linear subspace of $V^{* \otimes 2}$. Other applications may involve a higher type of tensor
and have symmetry across more than two indices.

Let $V$ be a vector space of dimension $n$. Let $S_k$ be the set of permutations on
$k$ elements (i.e., bijections on $\{1,2,\ldots,k\}$). This set of permutations acts on
$V^{\otimes k}$ by doing the following on pure tensors:

\[
\sigma \cdot (v_1 \otimes v_2 \otimes \cdots \otimes v_k)
  = v_{\sigma^{-1}(1)} \otimes v_{\sigma^{-1}(2)} \otimes \cdots \otimes v_{\sigma^{-1}(k)} \tag{4.45}
\]

and extending by linearity on nonpure tensors. (Taking $\sigma^{-1}$ on the indices means
that $\sigma$ sends the vector in the $i$th position in the tensor product
$v_1 \otimes v_2 \otimes \cdots \otimes v_k$ to the $\sigma(i)$th position.)
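
The convention in Equation (4.45) is easy to get backwards, so a small sketch may help.
Here a pure tensor is modeled simply as a tuple of its factors and a permutation $\sigma$
as a tuple listing $\sigma(1),\ldots,\sigma(k)$. The names `act` and `compose` are
hypothetical helpers introduced only for this illustration.

```python
def act(sigma, factors):
    """Apply sigma to a pure tensor given as a tuple of factors.

    sigma[i - 1] holds sigma(i) for i = 1..k. The factor in position i is sent
    to position sigma(i), so position p of the result holds the factor with
    index sigma^{-1}(p), as in Equation (4.45).
    """
    k = len(factors)
    result = [None] * k
    for i in range(1, k + 1):
        result[sigma[i - 1] - 1] = factors[i - 1]
    return tuple(result)


def compose(sigma, tau):
    """Composition sigma after tau: apply tau first, then sigma."""
    return tuple(sigma[tau[i] - 1] for i in range(len(tau)))


if __name__ == "__main__":
    sigma = (2, 3, 1)  # 1 -> 2, 2 -> 3, 3 -> 1
    tau = (2, 1, 3)    # transposition of 1 and 2
    x = ("v1", "v2", "v3")
    print(act(sigma, x))
    # A left action satisfies (sigma tau) . x = sigma . (tau . x).
    print(act(compose(sigma, tau), x) == act(sigma, act(tau, x)))
```

Using $\sigma^{-1}$ on the indices is exactly what makes the second check succeed, that is,
what makes (4.45) a left action of $S_k$.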


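Definition 4.6.1. We say that a tensor $\alpha \in V^{\otimes k}$ is symmetric (resp.
antisymmetric) if $\sigma \cdot \alpha = \alpha$ (resp.
$\sigma \cdot \alpha = \operatorname{sign}(\sigma)\, \alpha$) for all $\sigma \in S_k$.
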
4.6.1 Symmetric Product 


Definition 4.6.2. Let $\alpha \in V^{\otimes k}$. We define the symmetrization of $\alpha$ to be

\[ S(\alpha) = \sum_{\sigma \in S_k} \sigma \cdot \alpha. \]


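Example 4.6.3. Let $V$ be a vector space. We consider tensors in $V \otimes V \otimes V$.
We will consider permutations in $S_3$, which has $3! = 6$ elements.

\begin{align*}
S(e_1 \otimes e_2 \otimes e_3) ={}& e_1 \otimes e_2 \otimes e_3 + e_2 \otimes e_1 \otimes e_3 + e_3 \otimes e_2 \otimes e_1 \\
 &+ e_1 \otimes e_3 \otimes e_2 + e_2 \otimes e_3 \otimes e_1 + e_3 \otimes e_1 \otimes e_2.
\end{align*}

In contrast,

\begin{align*}
S(e_1 \otimes e_1 \otimes e_2) ={}& e_1 \otimes e_1 \otimes e_2 + e_1 \otimes e_1 \otimes e_2 + e_2 \otimes e_1 \otimes e_1 \\
 &+ e_1 \otimes e_2 \otimes e_1 + e_1 \otimes e_2 \otimes e_1 + e_2 \otimes e_1 \otimes e_1 \\
 ={}& 2\,(e_1 \otimes e_1 \otimes e_2 + e_1 \otimes e_2 \otimes e_1 + e_2 \otimes e_1 \otimes e_1).
\end{align*}

By construction, the symmetrization $S$ defines a linear transformation
$S : V^{\otimes k} \to V^{\otimes k}$.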
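
The two computations in Example 4.6.3 can be reproduced mechanically. In the sketch below,
a pure tensor of basis vectors $e_{i_1} \otimes \cdots \otimes e_{i_k}$ is represented by
its index tuple, and a general tensor by a `Counter` mapping index tuples to coefficients.
This representation and the function name `symmetrize` are assumptions made for the
illustration.

```python
from collections import Counter
from itertools import permutations


def symmetrize(indices):
    """Symmetrization S of the pure tensor e_{i_1} (x) ... (x) e_{i_k}.

    Each rearrangement of the k positions corresponds to exactly one
    permutation in S_k, so looping over all rearrangements adds up
    sigma . alpha over every sigma in S_k.
    """
    result = Counter()
    for arrangement in permutations(range(len(indices))):
        result[tuple(indices[p] for p in arrangement)] += 1
    return result


if __name__ == "__main__":
    print(symmetrize((1, 2, 3)))  # six distinct terms, each with coefficient 1
    print(symmetrize((1, 1, 2)))  # three distinct terms, each with coefficient 2
```

The coefficient 2 in the second case is the quantity $m_1!\, m_2! \cdots m_n!$ that counts
the permutations fixing the index tuple.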
Definition 4.6.4. The subspace of $V^{\otimes k}$ given as the image of
$S : V^{\otimes k} \to V^{\otimes k}$ is called the $k$th symmetric product of $V$ and is
denoted by $\operatorname{Sym}^k V$.


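Proposition 4.6.5. The subspace $\operatorname{Sym}^k V$ is invariant under the action of
$S_k$ on $V^{\otimes k}$.

Proof. Let $\tau \in S_k$ be a permutation. Then on any pure tensor
$v_{i_1} \otimes v_{i_2} \otimes \cdots \otimes v_{i_k}$, the action of $\tau$ on
$S(v_{i_1} \otimes v_{i_2} \otimes \cdots \otimes v_{i_k})$ gives

\begin{align*}
\tau \cdot S(v_{i_1} \otimes v_{i_2} \otimes \cdots \otimes v_{i_k})
 &= \tau \cdot \Big( \sum_{\sigma \in S_k} \sigma \cdot v_{i_1} \otimes v_{i_2} \otimes \cdots \otimes v_{i_k} \Big) \\
 &= \sum_{\sigma \in S_k} \tau \cdot (\sigma \cdot v_{i_1} \otimes v_{i_2} \otimes \cdots \otimes v_{i_k})
  = \sum_{\sigma \in S_k} (\tau\sigma) \cdot v_{i_1} \otimes v_{i_2} \otimes \cdots \otimes v_{i_k} \\
 &= \sum_{\sigma' \in S_k} \sigma' \cdot v_{i_1} \otimes v_{i_2} \otimes \cdots \otimes v_{i_k}
  = S(v_{i_1} \otimes v_{i_2} \otimes \cdots \otimes v_{i_k}),
\end{align*}

where we obtain the last line because as $\sigma$ runs through all the permutations in
$S_k$, for any fixed $\tau \in S_k$, the compositions $\tau\sigma$ also run through all the
permutations of $S_k$.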


Corollary 4.6.6. For all symmetric tensors $\alpha \in \operatorname{Sym}^k V$, we have $S(\alpha) = k!\, \alpha$.


Proof. This follows immediately from Proposition 4.6.5 and Definition 4.6.2. 


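Proposition 4.6.7. Let $\{e_1, e_2, \ldots, e_n\}$ be a basis of $V$. Then the set

\[
\{ S(e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_k}) \mid 1 \le i_1 \le i_2 \le \cdots \le i_k \le n \}
\]

is a basis of $\operatorname{Sym}^k V$.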


Proof. Define
$T(k,n) = \{ (i_1, i_2, \ldots, i_k) \in \mathbb{N}^k \mid 1 \le i_1 \le i_2 \le \cdots \le i_k \le n \}$.
For this proof, if $i = (i_1, i_2, \ldots, i_k) \in \{1,2,\ldots,n\}^k$, denote
$e_i = e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_k}$. In the action of $S_k$ on
$\{1,2,\ldots,n\}^k$ defined by

\[
\sigma \cdot (i_1, i_2, \ldots, i_k) = (i_{\sigma^{-1}(1)}, i_{\sigma^{-1}(2)}, \ldots, i_{\sigma^{-1}(k)}),
\]

the set $T(k,n)$ contains exactly one representative from each orbit of this action.
This implies that $\{ S(e_i) \mid i \in T(k,n) \}$ spans
$\operatorname{Im} S = \operatorname{Sym}^k V$.

Now we show that $\{ S(e_i) \mid i \in T(k,n) \}$ is linearly independent. For any
$\sigma \in S_k$ and for any $i \in T(k,n)$, the permuted pure tensor $\sigma \cdot e_i$ is
another pure tensor in which each basis vector occurs the same number of times as in $e_i$.
For any $j \in \{1,2,\ldots,n\}^k$ let $g(j)$ be the same $k$-tuple $j$, but reorganized into
nondecreasing order. If the $k$-tuple $j$ consists of $m_1$ 1s, $m_2$ 2s, and so on, define
$f(j) = m_1!\, m_2! \cdots m_n!$. As the above examples illustrate, for all $i \in T(k,n)$,

\[
S(e_i) = f(i) \sum_{j \in \{1,\ldots,n\}^k \,:\, g(j) = i} e_j.
\]

Thus, in a linear combination

\[
\sum_{i \in T(k,n)} c_i\, S(e_i) = 0,
\]

we have $c_{g(j)} f(j) = 0$ for all $j \in \{1,\ldots,n\}^k$ because
$\{ e_j \mid j \in \{1,\ldots,n\}^k \}$ is a basis of $V^{\otimes k}$. However, $f(j) \ge 1$, so
we deduce that $c_{g(j)} = 0$ for all $j \in \{1,\ldots,n\}^k$. In particular, $c_i = 0$ for
all $i \in T(k,n)$. This establishes the linear independence.

We conclude that $\{ S(e_i) \mid i \in T(k,n) \}$ is a basis of $\operatorname{Sym}^k V$.


Corollary 4.6.8. Let $V$ be a vector space of dimension $n$. Then

\[ \dim \operatorname{Sym}^k V = \binom{n+k-1}{k}. \]

Proof. From Proposition 4.6.7, $\dim \operatorname{Sym}^k V$ is the cardinality of $T(k,n)$.
This particular enumeration problem, of counting the number of nondecreasing sequences of
length $k$ with values in $\{1,2,\ldots,n\}$, has a standard solution. Consider $n + k$ slots.
We have a bag of $n$ Xs and $k$ Ys. Put an X in the first slot. Fill the remaining
slots with Xs and Ys. Because we insist on an X in the first slot, there are
$\binom{n+k-1}{k}$ ways to fill the slots. However, the set of fillings as described is in
bijection with our desired set of sequences in the following way. For any filling, let $i_t$
be the number of Xs that occur before the $t$th Y. With this definition, the resulting
$k$-tuple $(i_1, i_2, \ldots, i_k)$ is nondecreasing with $1 \le i_t \le n$. (Placing an X in
the first slot ensured that $i_1 \ge 1$.) Conversely, any nondecreasing sequence
$(i_1, i_2, \ldots, i_k)$ leads to a unique filling of slots that satisfies our parameters.
The result follows.
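
The counting argument can be checked directly for small values. The sketch below
enumerates the nondecreasing sequences with the standard library and also implements the
map from fillings to sequences described in the proof. The function names are illustrative
only.

```python
from itertools import combinations_with_replacement
from math import comb


def nondecreasing_sequences(k, n):
    """All k-tuples (i_1 <= ... <= i_k) with entries in 1..n."""
    return list(combinations_with_replacement(range(1, n + 1), k))


def filling_to_sequence(filling):
    """Map a string of Xs and Ys that starts with X to (i_1, ..., i_k),
    where i_t is the number of Xs occurring before the t-th Y."""
    sequence, xs_seen = [], 0
    for symbol in filling:
        if symbol == "X":
            xs_seen += 1
        else:
            sequence.append(xs_seen)
    return tuple(sequence)


if __name__ == "__main__":
    n, k = 4, 3
    print(len(nondecreasing_sequences(k, n)), comb(n + k - 1, k))
    print(filling_to_sequence("XYXXYY"))
```

For $n = 4$ and $k = 3$ both counts are 20, and the filling XYXXYY (with three Xs and three
Ys) corresponds to the sequence (1, 3, 3).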


Given $\alpha \in \operatorname{Sym}^k V$ and $\beta \in \operatorname{Sym}^l V$, the tensor
product $\alpha \otimes \beta$ is of course an element of $V^{\otimes (k+l)}$ but is not
necessarily an element of $\operatorname{Sym}^{k+l} V$. However, it is possible to construct
a new product that remedies this deficiency.

Definition 4.6.9. Let $\alpha \in \operatorname{Sym}^k V$ and $\beta \in \operatorname{Sym}^l V$.
Define the symmetric product between $\alpha$ and $\beta$ as

\[ \alpha\beta = \frac{1}{k!\, l!}\, S(\alpha \otimes \beta). \]

Note that if $\alpha$ and $\beta$ are tensors of rank 1, then the product $\alpha\beta$ is
precisely the symmetrization of $\alpha \otimes \beta$. A few other properties of this
symmetric product also hold, which we summarize in Proposition 4.6.11. We need a lemma first.


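Lemma 4.6.10. Let $\alpha \in V^{\otimes k}$. If $S(\alpha) = 0$, then
$S(\alpha \otimes \beta) = S(\beta \otimes \alpha) = 0$ for all tensors $\beta$. Furthermore,
if $S(\alpha) = S(\alpha')$, then $S(\alpha \otimes \beta) = S(\alpha' \otimes \beta)$ for all
tensors $\beta$.

Proof. We first prove that if $S(\alpha) = 0$, then $S(\alpha \otimes \beta) = 0$ for all
tensors $\beta$ of rank $l$, and the result for $S(\beta \otimes \alpha)$ follows similarly.

Let $S'_k$ be the subset of permutations in $S_{k+l}$ that only permute the first $k$
elements of $\{1,2,\ldots,k+l\}$ and leave the remaining $l$ elements unchanged. Define
the relation $\sim$ on $S_{k+l}$ as $\tau_1 \sim \tau_2$ if and only if
$\tau_2^{-1}\tau_1 \in S'_k$. Since $S'_k$ is closed under taking inverse functions and
composition of functions, it is easy to see that $\sim$ is an equivalence relation on
$S_{k+l}$.

Let $C$ be a set of representatives of distinct equivalence classes of $\sim$. Then we
have

\begin{align*}
S(\alpha \otimes \beta)
 &= \sum_{\sigma \in S_{k+l}} \sigma \cdot (\alpha \otimes \beta)
  = \sum_{\tau \in C} \sum_{\sigma' \in S'_k} (\tau\sigma') \cdot (\alpha \otimes \beta) \\
 &= \sum_{\tau \in C} \tau \cdot \Big( \Big( \sum_{\sigma' \in S_k} \sigma' \cdot \alpha \Big) \otimes \beta \Big)
  = \sum_{\tau \in C} \tau \cdot ( S(\alpha) \otimes \beta ) = 0.
\end{align*}

For the second part of the lemma, suppose that $S(\alpha) = S(\alpha')$. Then
$S(\alpha - \alpha') = 0$. Thus, for all tensors $\beta$ we have
$S((\alpha - \alpha') \otimes \beta) = 0$. Hence,
$S(\alpha \otimes \beta) - S(\alpha' \otimes \beta) = 0$ and the result follows.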


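Proposition 4.6.11. Let $V$ be a vector space of dimension $n$. The following hold:

1. The symmetric product is bilinear: for all
$\alpha, \alpha_1, \alpha_2 \in \operatorname{Sym}^k V$, for all
$\beta, \beta_1, \beta_2 \in \operatorname{Sym}^l V$, and $\lambda$ in the base field,

\begin{align*}
(\alpha_1 + \alpha_2)\beta &= \alpha_1\beta + \alpha_2\beta, & (\lambda\alpha)\beta &= \lambda(\alpha\beta), \\
\alpha(\beta_1 + \beta_2) &= \alpha\beta_1 + \alpha\beta_2, & \alpha(\lambda\beta) &= \lambda(\alpha\beta).
\end{align*}

2. The symmetric product is commutative: for all $\alpha \in \operatorname{Sym}^k V$ and
$\beta \in \operatorname{Sym}^l V$, $\alpha\beta = \beta\alpha$.

3. The symmetric product is associative: for all $\alpha \in \operatorname{Sym}^r V$,
$\beta \in \operatorname{Sym}^s V$, and $\gamma \in \operatorname{Sym}^t V$, as an element of
$\operatorname{Sym}^{r+s+t} V$, we have

\[
(\alpha\beta)\gamma = \alpha(\beta\gamma) = \frac{1}{r!\, s!\, t!}\, S(\alpha \otimes \beta \otimes \gamma).
\]
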
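Proof. We leave part 1 of the proposition as an exercise for the reader.

For part 2, by Proposition 4.6.5, $S(\alpha \otimes \beta)$ is invariant under the action of
$S_{k+l}$. Consider the permutation $\sigma_0 \in S_{k+l}$ that maps the $(k+l)$-tuple
$(1,2,\ldots,k+l)$ to $(k+1,\ldots,k+l,1,\ldots,k)$. In each pure tensor in an expression of
$\alpha \otimes \beta$, the action $\sigma_0 \cdot (\alpha \otimes \beta)$ moves (and keeps in
the proper order) the vector terms coming from $\beta$ in front of the terms coming from
$\alpha$. Hence, we see that $\sigma_0 \cdot (\alpha \otimes \beta) = \beta \otimes \alpha$.
Thus, we conclude that

\[
\beta\alpha = \sigma_0 \cdot (\alpha\beta)
 = \sigma_0 \cdot \Big( \frac{1}{k!\, l!}\, S(\alpha \otimes \beta) \Big)
 = \frac{1}{k!\, l!}\, S(\alpha \otimes \beta) = \alpha\beta.
\]

Thus, the symmetric product is commutative.

For part 3, by Corollary 4.6.6, since $\alpha\beta$ is symmetric,

\[
S(\alpha\beta) = (r+s)!\, \alpha\beta = \frac{(r+s)!}{r!\, s!}\, S(\alpha \otimes \beta).
\]

Therefore, by Lemma 4.6.10, for all tensors $\gamma$ of rank $t$,

\[
S(\alpha\beta \otimes \gamma) = S\Big( \frac{(r+s)!}{r!\, s!}\, (\alpha \otimes \beta) \otimes \gamma \Big).
\]

Consequently,

\begin{align*}
(\alpha\beta)\gamma
 &= \frac{1}{(r+s)!\, t!}\, S(\alpha\beta \otimes \gamma)
  = \frac{1}{(r+s)!\, t!} \cdot \frac{(r+s)!}{r!\, s!}\, S((\alpha \otimes \beta) \otimes \gamma) \\
 &= \frac{1}{r!\, s!\, t!}\, S(\alpha \otimes \beta \otimes \gamma).
\end{align*}

It is easy to follow the same calculation and find that

\[
\alpha(\beta\gamma) = \frac{1}{r!\, s!\, t!}\, S(\alpha \otimes \beta \otimes \gamma),
\]

which shows that $(\alpha\beta)\gamma = \alpha(\beta\gamma)$ for all tensors $\alpha$,
$\beta$, and $\gamma$.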


By virtue of associativity, the symmetrization of a pure tensor
$S(v_1 \otimes v_2 \otimes \cdots \otimes v_k)$ is in fact

\[ v_1 v_2 \cdots v_k. \]

We think of this element as a commutative “product” between vectors, which is
linear in each term. With this notation in mind, one usually thinks of
$\operatorname{Sym}^k V$ as a vector space in its own right, independent of $V^{\otimes k}$,
with basis

\[
\{ e_{i_1} e_{i_2} \cdots e_{i_k} \mid 1 \le i_1 \le i_2 \le \cdots \le i_k \le n \}.
\]


Furthermore, analogous to polynomials in multiple variables, where for instance the monomial
$zyxz^2y = xy^2z^3$, any symmetric product vector $e_{i_1} e_{i_2} \cdots e_{i_k}$ is equal to
any other expression in which the particular vectors in the product are permuted.


4.6.2 Alternating Product 


We turn now to the alternating product, also called the wedge product. Many of 
the results for the alternating product parallel the symmetric product. 

Let $V$ be a vector space of dimension $n$ and let us continue to consider the action
of $S_k$ on $V^{\otimes k}$ as described in Equation (4.45). Recall the sign of a permutation
described in Definition 4.5.6.

Definition 4.6.12. Let $\alpha \in V^{\otimes k}$ be a tensor. We define the alternation of
$\alpha$ to be

\[ A(\alpha) = \sum_{\sigma \in S_k} \operatorname{sign}(\sigma)\, (\sigma \cdot \alpha). \]

Example 4.6.13. Let $V$ be a vector space. We consider tensors in $V \otimes V \otimes V$.
We will consider permutations in $S_3$, which has $3! = 6$ elements. The identity
permutation has a sign of 1, permutations that interchange only two elements have
a sign of $-1$, and the permutations that cycle through the three indices have a sign
of 1.

\begin{align*}
A(e_1 \otimes e_2 \otimes e_3) ={}& e_1 \otimes e_2 \otimes e_3 - e_2 \otimes e_1 \otimes e_3 - e_3 \otimes e_2 \otimes e_1 \\
 &- e_1 \otimes e_3 \otimes e_2 + e_2 \otimes e_3 \otimes e_1 + e_3 \otimes e_1 \otimes e_2.
\end{align*}

In contrast,

\begin{align*}
A(e_1 \otimes e_1 \otimes e_2) ={}& e_1 \otimes e_1 \otimes e_2 - e_1 \otimes e_1 \otimes e_2 - e_2 \otimes e_1 \otimes e_1 \\
 &- e_1 \otimes e_2 \otimes e_1 + e_1 \otimes e_2 \otimes e_1 + e_2 \otimes e_1 \otimes e_1 = 0.
\end{align*}
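
The same bookkeeping as for the symmetrization, with signs added, reproduces Example 4.6.13.
As before, representing a basis pure tensor by its index tuple is an assumption of this
sketch, and `perm_sign` and `alternate` are hypothetical helper names.

```python
from collections import Counter
from itertools import permutations


def perm_sign(p):
    """Sign of a permutation given as a tuple of distinct integers (inversion count)."""
    inversions = sum(
        1
        for a in range(len(p))
        for b in range(a + 1, len(p))
        if p[a] > p[b]
    )
    return -1 if inversions % 2 else 1


def alternate(indices):
    """Alternation A of the pure tensor e_{i_1} (x) ... (x) e_{i_k}.

    Each rearrangement of positions plays the role of sigma^{-1}; since a
    permutation and its inverse have the same sign, weighting by the sign of
    the rearrangement gives the sum in Definition 4.6.12.
    """
    result = Counter()
    for arrangement in permutations(range(len(indices))):
        result[tuple(indices[p] for p in arrangement)] += perm_sign(arrangement)
    # Drop terms whose coefficients cancelled.
    return {key: coeff for key, coeff in result.items() if coeff != 0}


if __name__ == "__main__":
    print(alternate((1, 2, 3)))
    print(alternate((1, 1, 2)))  # empty: every term cancels
```

The empty result for a repeated index is the cancellation that Proposition 4.6.14
establishes in general.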


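Proposition 4.6.14. Let $v_1 \otimes v_2 \otimes \cdots \otimes v_k$ be a pure tensor in
$V^{\otimes k}$. If $v_i = v_j$ for some pair $(i,j)$, where $i \neq j$, then

\[ A(v_1 \otimes v_2 \otimes \cdots \otimes v_k) = 0. \]

Proof. Suppose that in the pure tensor $v_1 \otimes v_2 \otimes \cdots \otimes v_k$, we have
$v_i = v_j$ for some pair $i \neq j$. Let $f \in S_k$ be the permutation that interchanges the
$i$th and $j$th entry and leaves all others fixed. This permutation $f$ is a transposition so
$\operatorname{sign}(f) = -1$. Define the relation $\sim$ on $S_k$ by $\sigma \sim \tau$ if
and only if $\tau^{-1}\sigma \in \{1, f\}$. Note that $f^2 = f \circ f = 1$ is the identity
permutation, and hence, $f = f^{-1}$. Because of these properties of $f$, we can easily check
that the relation $\sim$ is an equivalence relation on $S_k$.

Let $C$ be a set of representatives for all of the equivalence classes of $\sim$. If
$\alpha = v_1 \otimes v_2 \otimes \cdots \otimes v_k$, then $f \cdot \alpha = \alpha$ because
$v_i = v_j$. Thus,

\begin{align*}
A(\alpha) &= \sum_{\sigma \in C} \big( \operatorname{sign}(\sigma)(\sigma \cdot \alpha) + \operatorname{sign}(\sigma f)((\sigma f) \cdot \alpha) \big) \\
 &= \sum_{\sigma \in C} \big( \operatorname{sign}(\sigma)(\sigma \cdot \alpha) - \operatorname{sign}(\sigma)(\sigma \cdot (f \cdot \alpha)) \big) \\
 &= \sum_{\sigma \in C} \big( \operatorname{sign}(\sigma)(\sigma \cdot \alpha) - \operatorname{sign}(\sigma)(\sigma \cdot \alpha) \big) = 0.
\end{align*}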


By construction, the alternation $A$ defines a linear transformation
$A : V^{\otimes k} \to V^{\otimes k}$.

Definition 4.6.15. The subspace of $V^{\otimes k}$ given as the image of
$A : V^{\otimes k} \to V^{\otimes k}$ is called the $k$th alternating product or the $k$th
wedge product of $V$ and is denoted by $\bigwedge^k V$.

Proposition 4.6.16. The subspace $\bigwedge^k V$ is skew-invariant under the action of $S_k$
on $V^{\otimes k}$, i.e., for all $\sigma \in S_k$ and for all tensors
$\alpha \in \bigwedge^k V$, we have $\sigma \cdot \alpha = \operatorname{sign}(\sigma)\, \alpha$.
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Proof. (Left as an exercise for the reader.) 


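Corollary 4.6.17. For all alternating tensors $\alpha \in \bigwedge^k V$, we have
$A(\alpha) = k!\, \alpha$.

Proof. By Proposition 4.6.16,

\[
A(\alpha) = \sum_{\sigma \in S_k} \operatorname{sign}(\sigma)\, \sigma \cdot \alpha
 = \sum_{\sigma \in S_k} \operatorname{sign}(\sigma)^2\, \alpha = k!\, \alpha.
\]

Proposition 4.6.18. Let $\{e_1, e_2, \ldots, e_n\}$ be a basis of $V$. Then the set

\[
\{ A(e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_k}) \mid 1 \le i_1 < i_2 < \cdots < i_k \le n \}
\]

is a basis of $\bigwedge^k V$.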


Proof. (The proof of this proposition is similar to the proof of Proposition 4.6.7
with the exception that $A(e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_k}) = 0$ if
$i_s = i_t$ for some $s \neq t$. We leave the proof as an exercise.)


Corollary 4.6.19. Let $V$ be a vector space of dimension $n$. Then

\[ \dim \bigwedge\nolimits^k V = \binom{n}{k}. \]

Proof. The proof of this corollary is similar to that of Corollary 4.6.8. However, we
need to devise a counting argument that enumerates all strictly increasing sequences
of length $k$ with entries in $\{1,2,\ldots,n\}$. Consider the scenario where we have $n$
slots and a bag of $n-k$ Xs and $k$ Ys. There are $\binom{n}{k}$ ways to fill $n$ slots with
the Xs and Ys, by choosing the slots in which we put Ys. However, there is a bijection between
such fillings and our desired set of increasing sequences. For a given filling, define $i_t$
as the number of Xs or Ys before or including the $t$th Y. This is clearly an increasing
sequence with entries in $\{1,2,\ldots,n\}$. Furthermore, any such increasing sequence
gives us a unique filling of the slots with Xs and Ys. The result follows.
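
A short enumeration confirms the count for small cases. In this sketch the bijection from
the proof sends a filling to the positions of its Ys. The function names are illustrative
and not taken from the text.

```python
from itertools import combinations
from math import comb


def increasing_sequences(k, n):
    """All strictly increasing k-tuples with entries in 1..n."""
    return list(combinations(range(1, n + 1), k))


def filling_to_increasing(filling):
    """i_t is the number of symbols up to and including the t-th Y,
    which is the 1-based position of that Y."""
    return tuple(pos for pos, symbol in enumerate(filling, start=1) if symbol == "Y")


if __name__ == "__main__":
    n = 4
    for k in range(n + 1):
        print(k, len(increasing_sequences(k, n)), comb(n, k))
    print(filling_to_increasing("XYYX"))
```

Summing the counts over $k = 0, \ldots, n$ gives $2^n$, the total number of subsets of
$\{1, \ldots, n\}$.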


As in the case of the symmetric product, it is not hard to see that the tensor 
product of alternating tensors is, in general, not another alternating tensor. How- 
ever, it is possible to define a product between alternating tensors that produces 
another alternating tensor. 


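Definition 4.6.20. Let $V$ be a vector space, and let $\alpha \in \bigwedge^k V$ and
$\beta \in \bigwedge^l V$. We define

\[ \alpha \wedge \beta = \frac{1}{k!\, l!}\, A(\alpha \otimes \beta) \]

so that $\alpha \wedge \beta \in \bigwedge^{k+l} V$. We call this operation $\wedge$ the
exterior product or the wedge product of $\alpha$ and $\beta$.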


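Similar properties hold for the exterior product as for the symmetric product.

Proposition 4.6.21. Let $V$ be a vector space of dimension $n$. The following hold:

1. The exterior product is bilinear: for all $\alpha, \alpha_1, \alpha_2 \in \bigwedge^k V$,
$\beta, \beta_1, \beta_2 \in \bigwedge^l V$, and $\lambda$ in the base field,

\begin{align*}
(\alpha_1 + \alpha_2) \wedge \beta &= \alpha_1 \wedge \beta + \alpha_2 \wedge \beta, & (\lambda\alpha) \wedge \beta &= \lambda(\alpha \wedge \beta), \\
\alpha \wedge (\beta_1 + \beta_2) &= \alpha \wedge \beta_1 + \alpha \wedge \beta_2, & \alpha \wedge (\lambda\beta) &= \lambda(\alpha \wedge \beta).
\end{align*}

2. The exterior product is anticommutative in the sense that for all
$\alpha \in \bigwedge^k V$ and $\beta \in \bigwedge^l V$,

\[ \beta \wedge \alpha = (-1)^{kl}\, \alpha \wedge \beta. \]

3. The exterior product is associative: for all $\alpha \in \bigwedge^r V$,
$\beta \in \bigwedge^s V$, and $\gamma \in \bigwedge^t V$, as an element of
$\bigwedge^{r+s+t} V$, we have

\[
(\alpha \wedge \beta) \wedge \gamma = \alpha \wedge (\beta \wedge \gamma)
 = \frac{1}{r!\, s!\, t!}\, A(\alpha \otimes \beta \otimes \gamma).
\]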


Proof. Again we leave part 1 as an exercise for the reader.

For part 2, by Proposition 4.6.16, $A(\alpha \otimes \beta)$ is skew-invariant under the
action of $S_{k+l}$. As in the proof of Proposition 4.6.11, consider the permutation
$\sigma_0 \in S_{k+l}$ that maps the $(k+l)$-tuple $(1,2,\ldots,k+l)$ to
$(k+1,\ldots,k+l,1,\ldots,k)$. In each pure tensor in an expression of
$\alpha \otimes \beta$, the action $\sigma_0 \cdot (\alpha \otimes \beta)$ moves (and keeps in
the proper order) the vector terms coming from $\beta$ in front of the terms coming from
$\alpha$. Hence we see that $\sigma_0 \cdot (\alpha \otimes \beta) = \beta \otimes \alpha$.
Also, it is not difficult to see how $\sigma_0$ can be expressed using $kl$ transpositions
(permutations that interchange only two elements), and therefore,
$\operatorname{sign}(\sigma_0) = (-1)^{kl}$. Thus, we conclude that

\begin{align*}
\beta \wedge \alpha &= \frac{1}{k!\, l!}\, A(\beta \otimes \alpha)
 = \frac{1}{k!\, l!}\, A(\sigma_0 \cdot (\alpha \otimes \beta)) \\
 &= \operatorname{sign}(\sigma_0)\, \frac{1}{k!\, l!}\, A(\alpha \otimes \beta)
 = (-1)^{kl}\, \alpha \wedge \beta.
\end{align*}

Part 3 follows in a similar manner to the proof of Proposition 4.6.11 with appropriate
modifications, including an adaptation of Lemma 4.6.10 and using Corollary 4.6.17.


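By virtue of the associativity of the exterior product, the alternation of a pure
tensor $A(v_1 \otimes v_2 \otimes \cdots \otimes v_k)$ is denoted by

\[ v_1 \wedge v_2 \wedge \cdots \wedge v_k, \]

where we often think of this element as an anticommutative “product” between
vectors. This means that

\[
v_1 \wedge \cdots \wedge v_i \wedge \cdots \wedge v_j \wedge \cdots \wedge v_k
 = -\underbrace{v_1 \wedge \cdots \wedge v_j \wedge \cdots \wedge v_i \wedge \cdots \wedge v_k}_{\text{interchange } i,j} \tag{4.46}
\]

and also that for all $\sigma \in S_k$,

\[
\sigma \cdot (v_1 \wedge v_2 \wedge \cdots \wedge v_k) = \operatorname{sign}(\sigma)\, v_1 \wedge v_2 \wedge \cdots \wedge v_k. \tag{4.47}
\]

With this notation in mind, we often think of $\bigwedge^k V$ as a vector space in its own
right, independent of $V^{\otimes k}$, with basis

\[
\{ e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_k} \mid 1 \le i_1 < i_2 < \cdots < i_k \le n \}.
\]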


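Example 4.6.22. Let $V = \mathbb{R}^3$, and let

\[
\vec{v} = \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix}
\qquad\text{and}\qquad
\vec{w} = \begin{pmatrix} 3 \\ 0 \\ 2 \end{pmatrix}
\]

be vectors in $\mathbb{R}^3$. Recall that $\dim \bigwedge^2 V = 3$. With respect to the
standard basis in $\bigwedge^2 V$, we have

\begin{align*}
\vec{v} \wedge \vec{w} &= (\vec{e}_1 - \vec{e}_2 + 2\vec{e}_3) \wedge (3\vec{e}_1 + 2\vec{e}_3) \\
 &= 3\vec{e}_1 \wedge \vec{e}_1 + 2\vec{e}_1 \wedge \vec{e}_3 - 3\vec{e}_2 \wedge \vec{e}_1 - 2\vec{e}_2 \wedge \vec{e}_3 + 6\vec{e}_3 \wedge \vec{e}_1 + 4\vec{e}_3 \wedge \vec{e}_3 \\
 &= 2\vec{e}_1 \wedge \vec{e}_3 + 3\vec{e}_1 \wedge \vec{e}_2 - 2\vec{e}_2 \wedge \vec{e}_3 - 6\vec{e}_1 \wedge \vec{e}_3 \\
 &= -4\vec{e}_1 \wedge \vec{e}_3 + 3\vec{e}_1 \wedge \vec{e}_2 - 2\vec{e}_2 \wedge \vec{e}_3.
\end{align*}

For the symmetric product, recall that $\dim \operatorname{Sym}^2 V = 6$. With respect to the
standard basis in $\operatorname{Sym}^2 V$, we have

\begin{align*}
\vec{v}\vec{w} &= (\vec{e}_1 - \vec{e}_2 + 2\vec{e}_3)(3\vec{e}_1 + 2\vec{e}_3) \\
 &= 3\vec{e}_1\vec{e}_1 + 2\vec{e}_1\vec{e}_3 - 3\vec{e}_2\vec{e}_1 - 2\vec{e}_2\vec{e}_3 + 6\vec{e}_3\vec{e}_1 + 4\vec{e}_3\vec{e}_3 \\
 &= 3\vec{e}_1\vec{e}_1 - 3\vec{e}_1\vec{e}_2 + 8\vec{e}_1\vec{e}_3 - 2\vec{e}_2\vec{e}_3 + 4\vec{e}_3\vec{e}_3.
\end{align*}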

Proposition 4.6.23. Let $V$ be an $n$-dimensional vector space over a field. Let $v_i$
for $i = 1,\ldots,m$ be $m$ vectors in $V$ where $m \le n$. Let $w_j$ for $j = 1,\ldots,m$ be
another set of vectors, with $w_j \in \operatorname{Span}(v_i)$ given by
$w_j = \sum_{i=1}^{m} c_{ji} v_i$. Then

\[
w_1 \wedge w_2 \wedge \cdots \wedge w_m = (\det c_{ji})\, v_1 \wedge v_2 \wedge \cdots \wedge v_m.
\]

Proof. This is a simple matter of calculation, as follows:

\[
w_1 \wedge w_2 \wedge \cdots \wedge w_m
 = \Big( \sum_{i_1=1}^{m} c_{1 i_1} v_{i_1} \Big) \wedge \Big( \sum_{i_2=1}^{m} c_{2 i_2} v_{i_2} \Big) \wedge \cdots \wedge \Big( \sum_{i_m=1}^{m} c_{m i_m} v_{i_m} \Big). \tag{4.48}
\]

In any wedge product, if there is a repeated vector, the wedge product is 0. Therefore,
when distributing out the $m$ summations, the only nonzero terms are those
in which all the $i_1, i_2, \ldots, i_m$ are distinct. Furthermore, by Equation (4.47), any
nonzero term can be rewritten as

\[
v_{i_1} \wedge v_{i_2} \wedge \cdots \wedge v_{i_m} = \operatorname{sign}(\sigma)\, v_1 \wedge v_2 \wedge \cdots \wedge v_m,
\]

where $\sigma$ is the permutation given as a table by

\[
\sigma = \begin{pmatrix} 1 & 2 & \cdots & m \\ i_1 & i_2 & \cdots & i_m \end{pmatrix}.
\]

Furthermore, by selecting which integer is chosen for each $i_t$ in each term on the
right side of Equation (4.48), we see that every possible permutation is used exactly
once. Thus, we have

\[
w_1 \wedge w_2 \wedge \cdots \wedge w_m
 = \Big( \sum_{\sigma \in S_m} \operatorname{sign}(\sigma)\, c_{1\sigma^{-1}(1)}\, c_{2\sigma^{-1}(2)} \cdots c_{m\sigma^{-1}(m)} \Big)\, v_1 \wedge v_2 \wedge \cdots \wedge v_m.
\]

The content of the parentheses in the above equation is precisely the determinant
of the matrix $(c_{ji})$ and the proposition follows.
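
The sum over permutations that appears at the end of this proof is the Leibniz expansion of
the determinant. A minimal sketch of that expansion follows. It is written for small
matrices only, since the number of terms grows like $m!$, and the name
`det_by_permutations` is a hypothetical helper.

```python
from itertools import permutations


def perm_sign(p):
    """Sign of a permutation given as a tuple of distinct integers (inversion count)."""
    inversions = sum(
        1
        for a in range(len(p))
        for b in range(a + 1, len(p))
        if p[a] > p[b]
    )
    return -1 if inversions % 2 else 1


def det_by_permutations(c):
    """Determinant as the signed sum over permutations of products c[row][sigma(row)]."""
    m = len(c)
    total = 0
    for sigma in permutations(range(m)):
        term = perm_sign(sigma)
        for row in range(m):
            term *= c[row][sigma[row]]
        total += term
    return total


if __name__ == "__main__":
    c = [[1, -1, 2], [3, 0, 2], [0, 1, 1]]
    print(det_by_permutations(c))
```

Summing over $\sigma$ or over $\sigma^{-1}$ gives the same value, because a permutation and
its inverse have the same sign and both run through all of $S_m$.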


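Example 4.6.24. Let $V = \mathbb{R}^n$ with standard basis $\vec{e}_i$, where
$i = 1,2,\ldots,n$. By Proposition 4.6.23, we have

\[
\vec{v}_1 \wedge \vec{v}_2 \wedge \cdots \wedge \vec{v}_n
 = \det \begin{pmatrix} | & | & & | \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \\ | & | & & | \end{pmatrix}
   \vec{e}_1 \wedge \vec{e}_2 \wedge \cdots \wedge \vec{e}_n.
\]

By a standard result, the determinant
$\det \begin{pmatrix} \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \end{pmatrix}$ is the signed
volume of the parallelepiped spanned by $\{\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n\}$.

Furthermore, if we consider the element
$e^{*1} \wedge \cdots \wedge e^{*n} \in \bigwedge^n V^*$ as an alternating multilinear
function on $V$, we have

\begin{align*}
e^{*1} \wedge \cdots \wedge e^{*n}(\vec{v}_1, \ldots, \vec{v}_n)
 &= \Big( \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, \sigma \cdot (e^{*1} \otimes \cdots \otimes e^{*n}) \Big)(\vec{v}_1, \ldots, \vec{v}_n) \\
 &= \det \begin{pmatrix} | & | & & | \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \\ | & | & & | \end{pmatrix}.
\end{align*}

Therefore, the element $e^{*1} \wedge \cdots \wedge e^{*n}$ is often called the signed volume
form on $V$.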
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PROBLEMS 


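4.6.1. Let $V = \mathbb{R}^3$, and consider the linear transformation $T : V \to V$ given by

\[
T(\vec{v}) = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \vec{v}
\]

with respect to the standard basis of $\mathbb{R}^3$.

(a) Prove that the function $S : \bigwedge^2 V \to \bigwedge^2 V$ that satisfies

\[ S(\vec{v}_1 \wedge \vec{v}_2) = T(\vec{v}_1) \wedge T(\vec{v}_2) \]

extends to a linear transformation.

(b) Determine the matrix of $S$ with respect to the associated basis
$\{\vec{\imath} \wedge \vec{\jmath},\ \vec{\jmath} \wedge \vec{k},\ \vec{k} \wedge \vec{\imath}\}$.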
4.6.2. Repeat the above exercise but with Sym? V and changing the question accordingly. 
4.6.3. Prove Proposition 4.6.16. 
4.6.4. Prove Proposition 4.6.18. 
4.6.5. Prove part 1 of Proposition 4.6.11. 


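4.6.6. Let $V$ be a vector space over $\mathbb{C}$ of dimension $n$, and let $T$ be a linear
transformation $T : V \to V$ with eigenvalues $\lambda_i$, where $1 \le i \le n$. Let
$S : \bigwedge^2 V \to \bigwedge^2 V$ be defined by
$S(v_1 \wedge v_2) = T(v_1) \wedge T(v_2)$.

(a) Prove that the eigenvalues of $S$ are $\lambda_i \lambda_j$ for $1 \le i < j \le n$.
(b) Prove that $\det S = (\det T)^{n-1}$.

(c) Prove that the trace of $S$ is

\[ \frac{1}{2} \Big( \Big( \sum_{i=1}^{n} \lambda_i \Big)^2 - \sum_{i=1}^{n} \lambda_i^2 \Big). \]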


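4.6.7. Let $V$ be a vector space over $\mathbb{C}$ of dimension $n$, and let $T$ be a linear
transformation $T : V \to V$ with eigenvalues $\lambda_i$, where $1 \le i \le n$. Let
$S : \operatorname{Sym}^2 V \to \operatorname{Sym}^2 V$ be defined by
$S(v_1 v_2) = T(v_1) T(v_2)$.

(a) Prove that the eigenvalues of $S$ are $\lambda_i \lambda_j$ for $1 \le i \le j \le n$.
(b) Prove that $\det S = (\det T)^{n+1}$.
(c) Prove that the trace of $S$ is

\[ \frac{1}{2} \Big( \Big( \sum_{i=1}^{n} \lambda_i \Big)^2 + \sum_{i=1}^{n} \lambda_i^2 \Big). \]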
i=1 
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4.7 Algebra over a Field 


We conclude this chapter by introducing the concept of an algebra over a field. 
Even if the reader is not familiar with the technical term “algebra,” she has already
encountered this algebraic structure both in this book and in previous study. The
reader surely is familiar with the word “algebra” used in a variety of contexts; the 
precise definition for an algebra aligns with the casual use of the word. Furthermore, 
introducing this notion here allows us to give a broader perspective on multilinear 
algebra. In addition, we present the concept of a derivation, which plays a central 
role in analysis on manifolds. 


4.7.1 Algebras 


Definition 4.7.1. Let $K$ be a field. An algebra over $K$ is a vector space $A$ over $K$
equipped with a bilinear transformation $A \times A \to A$. The bilinear transformation
is usually called a product.


It is not uncommon to change the terminology slightly and refer to an algebra 
on A. We say that an algebra is commutative (resp. associative) depending on 
whether the product is commutative (resp. associative). Note that the bilinear 
property implies that the product distributes over the addition. 

A few common vector spaces are in fact algebras. We consider a few examples.


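Example 4.7.2. One of the first nontrivial examples of an algebra that mathematics
students encounter is the vector space $\mathbb{R}^3$ equipped with the cross product.
The properties that

\begin{align*}
(\vec{u} + \vec{v}) \times \vec{w} &= \vec{u} \times \vec{w} + \vec{v} \times \vec{w}, \\
\vec{u} \times (\vec{v} + \vec{w}) &= \vec{u} \times \vec{v} + \vec{u} \times \vec{w}, \\
(c\vec{u}) \times \vec{v} &= c(\vec{u} \times \vec{v}) = \vec{u} \times (c\vec{v})
\end{align*}

for all $\vec{u}, \vec{v}, \vec{w} \in \mathbb{R}^3$ and for all $c \in \mathbb{R}$ establish
that the cross product $\times$ is a bilinear transformation. This algebra is neither
commutative nor associative.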
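
A quick computation with the standard basis vectors shows both failures concretely. The
sketch below uses plain tuples for vectors. The function name `cross` is just a local
helper.

```python
def cross(u, v):
    """Cross product of two vectors in R^3, given as 3-tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


if __name__ == "__main__":
    i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
    # Not commutative: i x j and j x i differ by a sign.
    print(cross(i, j), cross(j, i))
    # Not associative: (i x i) x j is zero, but i x (i x j) is not.
    print(cross(cross(i, i), j), cross(i, cross(i, j)))
```

Here $(\vec{\imath} \times \vec{\imath}) \times \vec{\jmath} = \vec{0}$ while
$\vec{\imath} \times (\vec{\imath} \times \vec{\jmath}) = -\vec{\jmath}$.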


Example 4.7.3. In linear algebra, the operations of addition and scalar multiplication
on the set $M_n(\mathbb{R})$ of $n \times n$ matrices with coefficients in $\mathbb{R}$ make
it a vector space. However, the operation of matrix multiplication on $M_n(\mathbb{R})$ is
linear in each entry, giving $M_n(\mathbb{R})$ the structure of an algebra. This algebra is
not commutative but it is associative.


Example 4.7.4. The set $\mathbb{R}[x]$ of all polynomials with coefficients in $\mathbb{R}$,
equipped with scalar multiplication, addition, and polynomial multiplication, is an algebra.
(The set of polynomials of degree $n$ or less is a vector space, but it is not closed under
multiplication, so it is not a subalgebra.) That is why the expression
“polynomial algebra” makes sense. Polynomial algebra over a field is both commutative
and associative.


Example 4.7.5. Let $I$ be an interval of $\mathbb{R}$. The set of continuous functions
$C^0(I, \mathbb{R})$, and more generally the set of functions of any differentiability class,
forms a vector space when we consider scalar multiplication and addition of functions.
However, by virtue of distributivity, multiplication of functions equips these sets with the
structure of an algebra over $\mathbb{R}$.


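The concept of an algebra allows us to recast some of our constructions concerning
tensor products into a broader perspective that will be useful later on. We
introduce the tensor, symmetric, and alternating algebra on a vector space in tandem.
Let $V$ be a vector space over a field $K$. In all the following cases, the product
of an element in $K$ with anything else corresponds to scalar multiplication.

1. The tensor algebra on $V$, denoted $T^\bullet V$, is the infinite direct sum

\[
T^\bullet V = \bigoplus_{j=0}^{\infty} V^{\otimes j}
 = K \oplus V \oplus (V \otimes V) \oplus (V \otimes V \otimes V) \oplus \cdots
\]

with the bilinear product on $T^\bullet V$ induced from
$\otimes : V^{\otimes s} \times V^{\otimes t} \to V^{\otimes (s+t)}$ and extended by linearity.

2. The symmetric algebra on $V$, denoted $\operatorname{Sym}^\bullet V$, is the infinite
direct sum

\[
\operatorname{Sym}^\bullet V = \bigoplus_{j=0}^{\infty} \operatorname{Sym}^j V
 = K \oplus V \oplus \operatorname{Sym}^2 V \oplus \operatorname{Sym}^3 V \oplus \cdots
\]

with the bilinear product on $\operatorname{Sym}^\bullet V$ induced from
$\cdot : \operatorname{Sym}^k V \times \operatorname{Sym}^l V \to \operatorname{Sym}^{k+l} V$
described in Definition 4.6.9 and extended by linearity.

3. The alternating algebra on $V$, denoted $\bigwedge^\bullet V$, is the infinite direct sum

\[
\bigwedge\nolimits^\bullet V = \bigoplus_{j=0}^{\infty} \bigwedge\nolimits^j V
 = K \oplus V \oplus \bigwedge\nolimits^2 V \oplus \bigwedge\nolimits^3 V \oplus \cdots
\]

with the bilinear product on $\bigwedge^\bullet V$ induced from the exterior product
(Definition 4.6.20) $\wedge : \bigwedge^s V \times \bigwedge^t V \to \bigwedge^{s+t} V$ and
extended by linearity.

As long as $V$ is a nontrivial vector space, $T^\bullet V$ and
$\operatorname{Sym}^\bullet V$ are infinite dimensional. However, if $V$ is finite dimensional
with $\dim V = n$, then $\bigwedge^k V$ is trivial for $k > n$ and

\[
\dim \bigwedge\nolimits^\bullet V = \sum_{j=0}^{n} \dim \bigwedge\nolimits^j V
 = \sum_{j=0}^{n} \binom{n}{j} = 2^n.
\]

The tensor, symmetric, and alternating algebras on a vector space are associative.
However, only the symmetric algebra is commutative.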


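Example 4.7.6. As an example of operations in $T^\bullet V$, suppose that $V = \mathbb{R}^3$ with
basis $\{\vec\imath, \vec\jmath, \vec{k}\}$. Let $\alpha = 4 + 2\vec\imath - 3\,\vec\imath \otimes \vec{k}$ and let $\beta = 7 + 3\vec\imath - \vec{k}$. The addition of $\alpha$ and
$\beta$ in $T^\bullet V$, after collecting like terms, is

$$\alpha + \beta = 11 + 5\vec\imath - \vec{k} - 3\,\vec\imath \otimes \vec{k}.$$

The product follows from distributivity by

$$\begin{aligned}
\alpha \otimes \beta &= (4 + 2\vec\imath - 3\,\vec\imath \otimes \vec{k}) \otimes (7 + 3\vec\imath - \vec{k}) \\
&= 28 + 12\vec\imath - 4\vec{k} + 14\vec\imath + 6\,\vec\imath \otimes \vec\imath - 2\,\vec\imath \otimes \vec{k} - 21\,\vec\imath \otimes \vec{k} \\
&\qquad - 9\,\vec\imath \otimes \vec{k} \otimes \vec\imath + 3\,\vec\imath \otimes \vec{k} \otimes \vec{k} \\
&= 28 + 26\vec\imath - 4\vec{k} + 6\,\vec\imath \otimes \vec\imath - 23\,\vec\imath \otimes \vec{k} - 9\,\vec\imath \otimes \vec{k} \otimes \vec\imath + 3\,\vec\imath \otimes \vec{k} \otimes \vec{k}.
\end{aligned}$$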
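The computation in Example 4.7.6 is mechanical enough to check by machine. Below is a minimal sketch, assuming we encode an element of the tensor algebra as a dictionary from words in the basis labels to coefficients (the empty word is the scalar part); this encoding and the helper names `add` and `tensor` are our own illustration, not notation from the text.

```python
from collections import defaultdict

def add(a, b):
    """Add two tensor-algebra elements, collecting like terms."""
    out = defaultdict(int)
    for elem in (a, b):
        for word, coeff in elem.items():
            out[word] += coeff
    return {w: c for w, c in out.items() if c != 0}

def tensor(a, b):
    """Tensor product extended by linearity: basis words concatenate."""
    out = defaultdict(int)
    for wa, ca in a.items():
        for wb, cb in b.items():
            out[wa + wb] += ca * cb
    return {w: c for w, c in out.items() if c != 0}

# A key is a tuple of basis labels; the empty tuple () is the scalar part.
alpha = {(): 4, ("i",): 2, ("i", "k"): -3}   # 4 + 2i - 3 i(x)k
beta = {(): 7, ("i",): 3, ("k",): -1}        # 7 + 3i - k

total = add(alpha, beta)
product = tensor(alpha, beta)
```

Because words are concatenated in order, `tensor(alpha, beta)` and `tensor(beta, alpha)` differ in general, which reflects the fact that the tensor algebra is not commutative.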


The tensor, symmetric and alternating algebras associated to a vector space are 
examples of graded algebras, graded by N. 


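Definition 4.7.7. An $\mathbb{N}$-graded algebra is an algebra whose underlying vector space is
expressed as

$$V = \bigoplus_{j \in \mathbb{N}} V_j,$$

where $V_j$ is a vector space for all $j \in \mathbb{N}$ and in which, for any $j, k \in \mathbb{N}$, the product
$\cdot$ has $V_j \cdot V_k \subseteq V_{j+k}$.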


4.7.2 Generating Subsets 


Recall that in linear algebra, if $S$ is a nonempty subset of a vector space $V$, the
span of $S$ consists of all linear combinations of elements in $S$, namely,

$$\operatorname{Span}(S) = \{c_1 u_1 + c_2 u_2 + \cdots + c_n u_n \mid n \in \mathbb{N}^*,\ c_1, \ldots, c_n \in K, \text{ and } u_1, \ldots, u_n \in S\}.$$

It is an easy exercise in linear algebra to show that for any nonempty set $S$, the set
$\operatorname{Span}(S)$ is a subspace of $V$. We say that $S$ spans $V$ if $\operatorname{Span}(S) = V$.

In contrast for algebras, if S is a subset of an algebra A, we define the subset 
of A generated by S as the smallest subset T of A that contains S, is closed under 
multiplication by any scalar, closed under addition, and closed under the algebra 
product. By distributivity of scalar multiplication, associativity of addition, and 
distributivity of the product, the subset generated by S is a subalgebra of A. We 
say that A is generated by the subset S if the subalgebra generated by S is all of 
A. 


Example 4.7.8. Consider the set $K[x]$ of all polynomials with coefficients from a field
$K$ and consider the subset $\{1, x\}$. Repeatedly taking the product of $x$ with itself produces the
infinite set $\{x, x^2, x^3, \ldots\}$. Taking finite sums of scalar multiples of elements
in $\{1, x, x^2, \ldots\}$ gives every polynomial. Consequently, $\{1, x\}$ generates $K[x]$ as an
algebra.


Example 4.7.9. Let $V$ be a vector space with basis $\{u_1, u_2, \ldots, u_n\}$. It is not
hard to see that, using their respective products, the tensor algebra, the symmetric
algebra, and the alternating algebra on $V$ are all generated by $\{1, u_1, u_2, \ldots, u_n\}$.




4.7.3 Derivations


Among the examples of algebras that we presented above, consider the algebra
$C^\infty(I, \mathbb{R})$ of smooth real-valued functions over an interval $I$ of $\mathbb{R}$. The derivative
operator $D$ on $C^\infty(I, \mathbb{R})$ is a linear transformation. The derivative of a product
is not the product of the derivatives, but the derivative does satisfy the product rule, also
called Leibniz's law.


Definition 4.7.10. Let $A$ be an algebra over a field $K$. A derivation on $A$ is a
linear transformation $D : A \to A$ that satisfies Leibniz's law,

$$D(ab) = D(a)b + aD(b), \quad \text{for all } a, b \in A.$$

The set of all derivations on $A$ is denoted by $\operatorname{Der}_K(A)$.


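Example 4.7.11. Let $K[x]$ be the polynomial algebra. Define $D : K[x] \to K[x]$
by

$$D(a_n x^n + \cdots + a_1 x + a_0) = n a_n x^n + (n-1) a_{n-1} x^{n-1} + \cdots + a_1 x.$$

We recognize this as $x$ times the derivative of the polynomial. We prove directly
that this is a derivation.

It is easy to see that $D$ is a linear transformation. We need to check the Leibniz
rule. Let $a(x) = a_n x^n + \cdots + a_1 x + a_0$ and $b(x) = b_m x^m + \cdots + b_1 x + b_0$. Assuming
$a_i = 0$ if $i < 0$ or $i > n$ and similarly for the coefficients of $b(x)$, then we can write
the product as

$$a(x)b(x) = \sum_{k=0}^{m+n} \Bigl( \sum_{i+j=k} a_i b_j \Bigr) x^k.$$

So

$$\begin{aligned}
D(a(x)b(x)) &= \sum_{k=1}^{m+n} \Bigl( \sum_{i+j=k} k\, a_i b_j \Bigr) x^k = \sum_{k=1}^{m+n} \Bigl( \sum_{i+j=k} (i+j)\, a_i b_j \Bigr) x^k \\
&= \sum_{k=1}^{m+n} \Bigl( \sum_{i+j=k} i\, a_i b_j \Bigr) x^k + \sum_{k=1}^{m+n} \Bigl( \sum_{i+j=k} j\, a_i b_j \Bigr) x^k \\
&= D(a(x))\,b(x) + a(x)\,D(b(x)).
\end{aligned}$$

This shows that $D$ is a derivation.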
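As a sanity check of Example 4.7.11 on a concrete pair of polynomials, here is a minimal sketch, assuming a polynomial is stored as its list of coefficients `[a_0, a_1, ..., a_n]`; the helpers `poly_mul`, `poly_add`, and `D` are our own names for illustration.

```python
def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists [a_0, a_1, ...]."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_add(a, b):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def D(a):
    """x times the derivative: the coefficient of x^k is multiplied by k."""
    return [k * ak for k, ak in enumerate(a)]

a = [1, -2, 0, 5]   # 1 - 2x + 5x^3
b = [4, 3, 7]       # 4 + 3x + 7x^2

lhs = D(poly_mul(a, b))                                # D(ab)
rhs = poly_add(poly_mul(D(a), b), poly_mul(a, D(b)))   # D(a)b + aD(b)
```

The two coefficient lists `lhs` and `rhs` agree, as the Leibniz rule predicts. A numerical check on one pair is of course not a proof; the proof is the index computation above.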


Proposition 4.7.12. Suppose that an algebra $A$ is generated by a subset $S$. Then
two derivations $D_1$ and $D_2$ are equal if and only if they agree on all elements of $S$.


Proof. By Problem 4.7.8, the set of derivations $\operatorname{Der}_K(A)$ is a subspace of $\operatorname{Hom}_K(A, A)$.
In particular $D_1 - D_2$ is a derivation. So $D_1$ and $D_2$ are equal as derivations if and
only if $D = D_1 - D_2$ is the trivial function. So it suffices to prove that a derivation
is trivial if and only if it maps all elements of $S$ to 0.
If $D$ is trivial, it obviously maps all elements of $S$ to 0. We need to prove the
converse. Suppose that $D$ maps all elements of the subset $S$ to 0. For all $a \in A$,
$0a = 0 = a0$, where 0 is the zero vector of $A$. So

$$\begin{aligned}
&\forall v \in S,\ \forall c \in K, & D(cv) &= cD(v) = c0 = 0, \\
&\forall u, v \in S, & D(u + v) &= D(u) + D(v) = 0 + 0 = 0, \\
&\forall u, v \in S, & D(uv) &= D(u)v + uD(v) = 0v + u0 = 0.
\end{aligned}$$

Since $D$ is trivial on all of $S$ and since the same three computations show that the property
$D(a) = 0$ is preserved by the three operations that define $A$ recursively from $S$, then $D$ is
trivial on all of $A$. The proposition follows.


PROBLEMS 


4.7.1. Let $K$ be a field and let $M_n(K)$ be the set of $n \times n$ matrices with coefficients in
$K$. Define the bracket operation on $M_n(K)$ by $[A, B] = AB - BA$.


(a) Prove that $M_n(K)$ equipped with $[\,,\,]$ is an algebra.


(b) Prove that this algebra is neither commutative nor associative.


4.7.2. Prove that the set of bilinear products on $V$ is a vector space and show a canonical
isomorphism between the set of bilinear products on $V$ and $V \otimes V^* \otimes V^*$.


4.7.3. Let $V$ be a vector space over a field $K$. Prove that the direct sum of all tensor
products of type $(r, s)$,

$$\bigoplus_{r=0}^{\infty} \bigoplus_{s=0}^{\infty} V^{\otimes r} \otimes V^{* \otimes s}, \quad \text{where } V^{\otimes 0} = K \text{ and } K \otimes K = K,$$

is an algebra with the usual tensor product $\otimes$ as the bilinear product.


4.7.4. Let $V$ be a 2-dimensional vector space over $K$ with basis $\{e_1, e_2\}$. Suppose that
$S : V \to V$ is a linear transformation. Define $\bigwedge S : \bigwedge V \to \bigwedge V$ as

$$(\textstyle\bigwedge S)(c_0 + c_1 e_1 + c_2 e_2 + c_3\, e_1 \wedge e_2) = c_0 + c_1 S(e_1) + c_2 S(e_2) + c_3\, S(e_1) \wedge S(e_2).$$

(a) Prove that $\bigwedge S$ is a linear transformation.
(b) Suppose that with respect to the ordered basis $(e_1, e_2)$ on $V$, the matrix
of $S$ is $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Find the matrix of $\bigwedge S$ with respect to the ordered basis
$(1, e_1, e_2, e_1 \wedge e_2)$ on $\bigwedge V$.
4.7.5. Repeat the previous exercise assuming that $V$ is a vector space of dimension 3.


4.7.6. Let $I$ be an interval of $\mathbb{R}$ and consider the vector space $C^\infty(I, \mathbb{R})$. Show that the
second derivative $D^2$ is not a derivation.


4.7.7. Let $I$ be an interval of $\mathbb{R}$. Consider the vector space $C^\infty(I, \mathbb{R})$ and let $D$ be the
derivative operator on $C^\infty(I, \mathbb{R})$. Let $g : I \to I$ be a differentiable function.
(a) Prove that $D_1 : C^\infty(I, \mathbb{R}) \to C^\infty(I, \mathbb{R})$ defined by $D_1(f) = g \cdot D(f)$ is a
derivation.

(b) Explain why $D_2 : C^\infty(I, \mathbb{R}) \to C^\infty(I, \mathbb{R})$ defined by $D_2(f) = D(f \circ g)$ is not
a derivation.


4.7.8. Prove that $\operatorname{Der}_K(A)$ is a vector subspace of $\operatorname{Hom}_K(A, A)$.


4.7.9. Let $A$ be an algebra. Prove that $\operatorname{Der}_K(A)$ is an algebra when equipped with the
bilinear transformation

$$[D_1, D_2] = D_1 \circ D_2 - D_2 \circ D_1.$$

[Hint: Use Problem 4.7.8.]
4.7.10. Let $U$ be an open subset of $\mathbb{R}^n$ and consider the algebra $C^\infty(U, \mathbb{R})$ of smooth
functions on $U$. Let $a_i(x_1, x_2, \ldots, x_n)$ be smooth functions over $U$ for $i = 1, 2, \ldots, n$.
Prove that

$$a_1(x_1, x_2, \ldots, x_n) \frac{\partial}{\partial x_1} + \cdots + a_n(x_1, x_2, \ldots, x_n) \frac{\partial}{\partial x_n}$$

is a derivation on $C^\infty(U, \mathbb{R})$.

CHAPTER 5


Analysis on Manifolds 


In Chapter 3, we introduced the concept of a differentiable manifold as motivated by 
a search for topological spaces over which it is possible to do calculus and ultimately 
dynamics. The idea of having a topological space locally homeomorphic to $\mathbb{R}^n$ drove
the definition of a differentiable manifold. Subsequent sections in that chapter 
discussed differentiable maps between manifolds and the differentials of such maps. 
We used these to introduce the important notions of immersions, submersions, and 
submanifolds as qualifiers of how manifolds may relate to one another. 

The astute reader might observe that we have not so far made good on our 
promise to do physics on a manifold, no matter how amorphous that expression may 
be. As an illustrative example, consider Newton's second law of motion applied to,
say, simple gravity, as follows:

$$m\vec{x}\,''(t) = m\vec{g}, \tag{5.1}$$

where $m$ is constant and $\vec{g}$ is a constant vector. The parametrized curve $\vec{x}(t)$ in $\mathbb{R}^3$
is called the trajectory and its acceleration vector $\vec{x}\,''(t)$ is also a vector function
in $\mathbb{R}^3$. In order for (5.1) to have meaning, it is essential that the quantities on
both sides of the equation exist in the same Euclidean space. Applying this type of
equation to the context of manifolds poses a variety of difficulties.

First, note that a curve in a manifold $M$ is a submanifold $\gamma : I \to M$, where $I$
is an open interval of $\mathbb{R}$, whereas the velocity vector of a curve at a point $p$ is an
element of the tangent plane to $M$ at $p$. Second, the discussion of differentials in
Chapter 3 does not readily extend to a concept of second derivatives for a curve in
a manifold. It is not even obvious in what space a second derivative would exist.
Consequently, it is not at all obvious how to transcribe equations of curves in $\mathbb{R}^3$
that involve $\vec{x}$, $\vec{x}\,'$, and $\vec{x}\,''$ to the context of manifolds.

Another difficulty arises when we try to express in the context of differentiable
manifolds the classical local theory of surfaces in $\mathbb{R}^3$ (as presented in [5, Chapter 5]).
It is not difficult to define the first fundamental form as a bilinear form on $T_pM$.
However, since we do not view a given manifold $M$ as a subset of any Euclidean
(vector) space, the concept of normal vectors does not exist. Therefore, there is no
equivalent of the second fundamental form, and all concepts of curvature become
problematic to define (see Chapter 6 in [5]).

This chapter does not yet discuss how to do physics on a manifold, but it does 
begin to show how to do calculus. We study in greater detail the tangent space
to a manifold $M$ at $p$. Also, in order to overcome the
conceptual hurdles mentioned above, we introduce the formalism of vector bun- 
dles on a manifold, discuss vector (and tensor) fields on the manifold, develop the 
calculus of differential forms, and end by considering integration on manifolds. 

In Chapter 4 we commented on how geometers and physicists both use tensors but
usually with very different notations (called coordinate-free and
coordinate-dependent, respectively). This difference continues here as we use tensor fields on manifolds. If a
reader is already familiar with one or the other habits of notation, it is very useful 
to recognize both as representing the same kind of object. However, we must begin 
by introducing the vector bundle formalism. 


5.1 Vector Bundles on Manifolds 


A vector bundle over a manifold is a particular case of a fiber bundle over a topo- 
logical space. As we do not need the full generality of fiber bundles in this book, we 
refer the interested reader to [53] or [12] and present instead the specific formalism 
of vector bundles. 

Chapter 3 discussed tangent spaces to manifolds. To each point $p \in M$, we
associated a tangent space. The elements of the tangent space are differential
operators on differentiable functions $f : M \to \mathbb{R}$. Despite their slightly more abstract
definition, such differential operators properly model the role of tangent vectors.
Since $M$ is not a subset of some Euclidean space, the tangent spaces $T_pM$ are not
subspaces of any ambient space either. A manifold equipped with tangent spaces
at each point motivates the idea of "attaching" a vector space to each point $p$ of a
manifold $M$. Furthermore, from an intuitive perspective, we would like to attach
these vector spaces, in some sense, continuously. The following definition formalizes
this perspective.


Definition 5.1.1. Let $M^n$ be a differentiable manifold with atlas $\mathcal{A} =
\{(U_\alpha, \phi_\alpha)\}_{\alpha \in I}$, and let $V$ be a finite-dimensional, real, vector space. A vector bundle
over $M$ of fiber $V$ is a Hausdorff topological space $E$ with a continuous surjection
$\pi : E \to M$ (called a bundle projection) and a collection $\Psi$ of homeomorphisms
(called trivializations) $\psi_\alpha : U_\alpha \times V \to \pi^{-1}(U_\alpha)$, satisfying


1. $\pi \circ \psi_\alpha(p, v) = p$ for all $(p, v) \in U_\alpha \times V$ and $E_p = \pi^{-1}(p)$ is homeomorphic
to $V$;


2. if $U_\alpha \cap U_\beta \neq \emptyset$, then $\psi_\beta^{-1} \circ \psi_\alpha : (U_\alpha \cap U_\beta) \times V \to (U_\alpha \cap U_\beta) \times V$ is of the form

$$\psi_\beta^{-1} \circ \psi_\alpha(p, v) = (p, \theta_{\beta\alpha}(p)v),$$

where $\theta_{\beta\alpha} : U_\alpha \cap U_\beta \to \mathrm{GL}(V)$ is a continuous map into the general linear
group (i.e., the set of invertible linear transformations from $V$ to $V$).
The vector bundle is called differentiable (respectively, $C^k$ or smooth) if $M$ is
differentiable (respectively, $C^k$ or smooth) and if all the maps $\theta_{\beta\alpha}$ are differentiable
(respectively, $C^k$ or smooth) as maps between manifolds.


We point out that when $V$ is a finite dimensional vector space of dimension $n$,
we can identify $\mathrm{GL}(V)$ as an open subset of the set $\mathbb{R}^{n^2}$. Hence, $\mathrm{GL}(V)$ naturally
carries the structure of a differentiable manifold. It is in this sense that the functions
$\theta_{\beta\alpha}$ can be differentiable maps between manifolds.

A vector bundle whose fiber is one-dimensional is called a line bundle. 

Vector bundles are often denoted by a single Greek letter $\xi$ or $\eta$. The topological
space $E$ is called the total space and denoted by $E(\xi)$ while the manifold $M$ is called
the base space and denoted by $B(\xi)$.


Example 5.1.2 (The Trivial Bundle). Let $M^n$ be a manifold with atlas $\mathcal{A} = \{\phi_\alpha\}$,
and let $V$ be a real vector space. The topological space $M \times V$ is a vector bundle
over $M$. The trivialization maps $\psi_\alpha$ are all the identity maps on $U_\alpha \times V$ and the
maps $\theta_{\beta\alpha}$ are the identity linear transformation.


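Example 5.1.3 (Infinite Möbius Strip). Consider the circle $S^1$ as a manifold with
the atlas $\{(U_1, \phi_1), (U_2, \phi_2)\}$ defined by:

$$\begin{aligned}
\phi_1 &: U_1 = S^1 - \{(1, 0)\} \to (0, 2\pi), & \text{with } \phi_1(\cos u, \sin u) &= u, \\
\phi_2 &: U_2 = S^1 - \{(-1, 0)\} \to (\pi, 3\pi), & \text{with } \phi_2(\cos u, \sin u) &= u.
\end{aligned}$$

So $\phi_1$ uses as a coordinate the angle around $S^1$ from $(1, 0)$, while $\phi_2$ also uses as a
coordinate the angle around $S^1$ starting at $(1, 0)$ but with the value of the angle taken
in $(\pi, 3\pi)$. The transition map between these two charts is

$$\phi_{21} = \phi_2 \circ \phi_1^{-1} : (0, \pi) \cup (\pi, 2\pi) \to (\pi, 2\pi) \cup (2\pi, 3\pi), \qquad
u \mapsto \begin{cases} u + 2\pi & \text{if } 0 < u < \pi, \\ u & \text{if } \pi < u < 2\pi. \end{cases}$$

Now define the vector bundle $\xi$ of fiber $\mathbb{R}$ over $S^1$ as a total space $E$ with the
surjective map $\pi : E \to S^1$ defined by homeomorphisms $\psi_i : U_i \times \mathbb{R} \to \pi^{-1}(U_i)$ for
$i = 1, 2$, such that $\psi_2^{-1} \circ \psi_1 : (U_1 \cap U_2) \times \mathbb{R} \to (U_1 \cap U_2) \times \mathbb{R}$ is given by

$$\psi_2^{-1} \circ \psi_1(p, v) = (p, \theta_{21}(p)(v)),$$

where

$$\theta_{21}(p) = \begin{cases} -1, & \text{if } 0 < \phi_1(p) < \pi, \\ 1, & \text{if } \pi < \phi_1(p) < 2\pi. \end{cases} \tag{5.2}$$

Note that $\theta_{21} : U_1 \cap U_2 \to \mathrm{GL}(\mathbb{R}) = \mathbb{R} - \{0\}$ is constant on the connected components
of $U_1 \cap U_2 = S^1 - \{(1, 0), (-1, 0)\}$ and hence is continuous and also smooth. The
function $\theta_{12}$ is the inverse function of $\theta_{21}$ so has the same properties.

We will show that the image $M$ of the parametrized surface in $\mathbb{R}^4$ described by

$$Y(u, t) = \left( \cos u, \sin u, t\cos\left(\frac{u}{2}\right), t\sin\left(\frac{u}{2}\right) \right), \quad \text{with } (u, t) \in [0, 2\pi] \times \mathbb{R},$$

realizes this vector bundle $\xi$. The function $\pi : M \to S^1$ is defined by projection
onto the first two coordinates in $\mathbb{R}^4$. Note that the fiber above each point $p \in S^1$, i.e., the
set of points $\pi^{-1}(p)$, is a line in $\mathbb{R}^4$. Furthermore, it is not hard to see that $\pi^{-1}(U_i)$ is
homeomorphic to $U_i \times \mathbb{R}$ for $i = 1, 2$, each of which we can visualize as an infinite
strip. If we define $\psi_i : U_i \times \mathbb{R} \to \pi^{-1}(U_i)$ as

$$\psi_i(p, t) = Y(\phi_i(p), t),$$

then

$$\psi_2(p, t) = Y(\phi_2(p), t) = Y(\phi_{21}(\phi_1(p)), t).$$

Since $\phi_{\beta\alpha}(u)$ is always $u + 2\pi k$ for $k \in \{-1, 0, 1\}$ and $\cos$ and $\sin$ are periodic
with period $2\pi$, then

$$\pi(\psi_i(p, t)) = \pi(Y(\phi_i(p), t)) = p.$$

Furthermore, if $0 < \phi_1(p) < \pi$,

$$\begin{aligned}
\psi_2(p, t) &= Y(\phi_1(p) + 2\pi, t) \\
&= \left( \cos(\phi_1(p)), \sin(\phi_1(p)), t\cos\left(\tfrac{\phi_1(p)}{2} + \pi\right), t\sin\left(\tfrac{\phi_1(p)}{2} + \pi\right) \right) \\
&= \left( \cos(\phi_1(p)), \sin(\phi_1(p)), -t\cos\left(\tfrac{\phi_1(p)}{2}\right), -t\sin\left(\tfrac{\phi_1(p)}{2}\right) \right) = \psi_1(p, -t),
\end{aligned}$$

while, if $\pi < \phi_1(p) < 2\pi$, we have $\psi_2(p, t) = Y(\phi_1(p), t) = \psi_1(p, t)$. Consequently,

$$\psi_2^{-1} \circ \psi_1(p, t) = (p, \theta_{21}(p)(t))$$

for the transition functions $\theta_{21}$ defined in (5.2).
The subset $M$ is evidently not the cylinder $S^1 \times \mathbb{R}$. Furthermore, one can get
an intuition for this set as a Möbius band of infinite width.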
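The sign flip in the transition function of Example 5.1.3 can be observed numerically. The following is a minimal sketch, assuming the parametrization $Y(u, t) = (\cos u, \sin u, t\cos(u/2), t\sin(u/2))$ and the two angle charts of the example; the Python names (`phi1`, `psi2`, `theta21`, `close`) are ours. The identity $\psi_2^{-1} \circ \psi_1(p, t) = (p, \theta_{21}(p)t)$ is tested in the equivalent form $\psi_1(p, t) = \psi_2(p, \theta_{21}(p)t)$.

```python
import math

def Y(u, t):
    """The parametrized surface in R^4 from the example."""
    return (math.cos(u), math.sin(u), t * math.cos(u / 2), t * math.sin(u / 2))

def phi1(p):
    """Angle of p = (x, y) on the circle, taken in (0, 2*pi); not defined at (1, 0)."""
    return math.atan2(p[1], p[0]) % (2 * math.pi)

def phi2(p):
    """Angle of p taken in (pi, 3*pi); not defined at (-1, 0)."""
    u = phi1(p)
    return u + 2 * math.pi if u < math.pi else u

def psi1(p, t):
    return Y(phi1(p), t)

def psi2(p, t):
    return Y(phi2(p), t)

def theta21(p):
    """Transition function (5.2): -1 on the upper arc, +1 on the lower arc."""
    return -1.0 if phi1(p) < math.pi else 1.0

def close(a, b, tol=1e-12):
    """Componentwise comparison of two points of R^4 up to rounding error."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))

p_upper = (math.cos(1.0), math.sin(1.0))   # here 0 < phi1 < pi
p_lower = (math.cos(4.0), math.sin(4.0))   # here pi < phi1 < 2*pi
t = 2.5
upper_ok = close(psi1(p_upper, t), psi2(p_upper, theta21(p_upper) * t))
lower_ok = close(psi1(p_lower, t), psi2(p_lower, theta21(p_lower) * t))
```

Both flags come out true, while replacing `theta21` by the constant 1 would fail on the upper arc; this is the half twist of the Möbius band.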


The intuitive stance behind Definition 5.1.1 is that a vector bundle is not just a 
manifold with a vector space V associated to each point but that the vector spaces 
“vary continuously.” 

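Consider now a differentiable manifold $M$, and consider also the disjoint union of
all the tangent planes to $M$ at points $p \in M$, i.e.,

$$\coprod_{p \in M} T_pM = \{(p, X) \mid p \in M \text{ and } X \in T_pM\}.$$

The identity map $i : M \to M$ is certainly differentiable, and we calculate its
differential at a point $p \in U_\alpha \cap U_\beta$ in overlapping charts. Label the coordinate charts
$x = \phi_\alpha$ and $\bar{x} = \phi_\beta$. According to (3.13), the matrix of the differential of the
identity map is

$$[di_p] = \left( \frac{\partial \bar{x}^j}{\partial x^i} \bigg|_p \right),$$

and the reader should recall that the explicit meaning of this partial derivative is
given in (3.14). Given any pair of overlapping coordinate charts, this differential is
invertible so it is an element in $\mathrm{GL}_n(\mathbb{R})$ and corresponds to the maps $\theta_{\beta\alpha}$.

We can arrive at this same result in another way. Consider the coordinate
systems defined by $\bar{x}$ and $x$ over $U_\alpha \cap U_\beta$. The chain rule gives, as operators,

$$\frac{\partial}{\partial \bar{x}^j} \bigg|_p = \sum_{i=1}^{n} \frac{\partial x^i}{\partial \bar{x}^j} \bigg|_p \frac{\partial}{\partial x^i} \bigg|_p. \tag{5.3}$$

(The subscript $|_p$ becomes tedious and so in the remaining paragraphs, we
understand the differential operators and the matrices as depending on $p \in M$.) Recall
that by the chain rule $(\partial x^i / \partial \bar{x}^j)$ and $(\partial \bar{x}^j / \partial x^i)$ are inverse matrices to each other,
so, in particular,

$$\sum_{i=1}^{n} \frac{\partial x^i}{\partial \bar{x}^j} \frac{\partial \bar{x}^k}{\partial x^i} = \delta^k_j, \tag{5.4}$$

where $\delta^k_j$ is the Kronecker delta. Note that (5.4) follows from (5.3) by applying
$\partial / \partial \bar{x}^j$ to $\bar{x}^k$.

Let $X \in T_pM$ be a vector in the tangent space. Suppose that the vector $X$ has
coordinates $a^i$ in the basis $(\partial / \partial x^i)$ and coordinates $\bar{a}^j$ in the basis $(\partial / \partial \bar{x}^j)$. Using
Einstein summation convention, we have $X = \bar{a}^j\, \partial / \partial \bar{x}^j$. Then

$$X = \bar{a}^j \frac{\partial}{\partial \bar{x}^j} = \bar{a}^j \frac{\partial x^i}{\partial \bar{x}^j} \frac{\partial}{\partial x^i}, \quad \text{so} \quad a^i = \bar{a}^j \frac{\partial x^i}{\partial \bar{x}^j}.$$

Multiplying by $\dfrac{\partial \bar{x}^k}{\partial x^i}$ and summing over $i$, we obtain

$$a^i \frac{\partial \bar{x}^k}{\partial x^i} = \bar{a}^j \frac{\partial x^i}{\partial \bar{x}^j} \frac{\partial \bar{x}^k}{\partial x^i} = \bar{a}^j \delta^k_j = \bar{a}^k.$$

This leads to the change-of-coordinates formula

$$\bar{a}^k = \frac{\partial \bar{x}^k}{\partial x^i}\, a^i. \tag{5.5}$$

The above calculations are important in their own right, but in terms of vector
bundles, they lead to the following proposition.


Proposition 5.1.4. Let $M^n$ be a differentiable manifold. The disjoint union of all
the tangent planes to $M$,

$$\coprod_{p \in M} T_pM,$$

is a vector bundle with fiber $\mathbb{R}^n$ over $M$.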
Proof. An element of $TM$ is of the form $(p, X_p)$, where $p \in M$ and $X_p \in T_pM$.
We point out first that the bundle projection $\pi : TM \to M$ is simply the function
$\pi(p, X_p) = p$.

We have already seen that for each $p \in M$, the matrix $(\partial \bar{x}^k / \partial x^i |_p)$ is invertible.
It remains to be verified that this matrix varies continuously in $p \in M$ over $U_\alpha \cap U_\beta$,
where $x$ is the coordinate system over $U_\alpha$ and $\bar{x}$ is the coordinate system over $U_\beta$.
However, $(\partial \bar{x}^k / \partial x^i |_p)$ is the matrix of the differential of $\bar{x} \circ x^{-1}$ and the fact that
this is continuous is part of the definition of a differentiable manifold (see Definition
3.1.3).


Definition 5.1.5. The vector bundle in Proposition 5.1.4 is called the tangent 
bundle to M and is denoted by TM. 
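The change-of-coordinates formula $\bar{a}^k = (\partial \bar{x}^k / \partial x^i)\, a^i$ for tangent vector components, together with the fact that the two Jacobian matrices are inverse to each other, can be tested on a familiar pair of charts. The following is a minimal sketch, assuming the illustrative choice $x = (r, \theta)$ (polar coordinates) and $\bar{x} = (X, Y)$ (Cartesian coordinates) on the plane with the origin removed; this pair of charts and all function names are ours. The transformed components are compared with the velocity of an explicit curve computed by central differences.

```python
import math

def to_cartesian(r, th):
    """The chart change xbar(x) for x = (r, theta), xbar = (X, Y)."""
    return (r * math.cos(th), r * math.sin(th))

def jacobian(r, th):
    """Matrix J[k][i] = d xbar^k / d x^i."""
    return [[math.cos(th), -r * math.sin(th)],
            [math.sin(th),  r * math.cos(th)]]

def inverse_jacobian(r, th):
    """Matrix Jinv[i][j] = d x^i / d xbar^j, valid for r > 0."""
    return [[math.cos(th), math.sin(th)],
            [-math.sin(th) / r, math.cos(th) / r]]

def change_components(r, th, a):
    """Change of coordinates: abar^k = (d xbar^k / d x^i) a^i."""
    J = jacobian(r, th)
    return [sum(J[k][i] * a[i] for i in range(2)) for k in range(2)]

def velocity_by_differences(r, th, a, h=1e-6):
    """Cartesian velocity at s = 0 of the curve s -> (r + s a^1, th + s a^2)."""
    fwd = to_cartesian(r + h * a[0], th + h * a[1])
    bwd = to_cartesian(r - h * a[0], th - h * a[1])
    return [(f - b) / (2 * h) for f, b in zip(fwd, bwd)]

r0, th0 = 2.0, 0.7
a = [0.5, -1.25]                       # components in the basis (d/dr, d/dtheta)
abar = change_components(r0, th0, a)   # components in the basis (d/dX, d/dY)
check = velocity_by_differences(r0, th0, a)

# The two Jacobian matrices are inverse to each other: J * Jinv = identity.
J, Jinv = jacobian(r0, th0), inverse_jacobian(r0, th0)
delta = [[sum(J[k][i] * Jinv[i][j] for i in range(2)) for j in range(2)]
         for k in range(2)]
```

Up to discretization error, `abar` agrees with `check`, and `delta` is the identity matrix. In bundle language, `jacobian(r0, th0)` plays the role of the transition function $\theta_{\beta\alpha}(p)$ of $TM$ at the point $p$ with polar coordinates `(r0, th0)`.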


There is an inherent difficulty in visualizing the tangent bundle, and more
generally any vector bundle, to a manifold. Consider the tangent bundle to a circle.
The circle $S^1$ is a one-dimensional manifold that we typically visualize as the unit
circle as a subset of $\mathbb{R}^2$. Viewing the tangent spaces to the circle as subspaces of $\mathbb{R}^2$
or even as the geometric tangent lines to $S^1$ at $p$, we would view the union $\bigcup_p T_pM$
as a subset of $\mathbb{R}^2$. This is not what is meant by the definition of $TM$. The spaces
$T_pM$ and $T_qM$ do not intersect if $p \neq q$. At best, if $M$ is an embedded submanifold
of $\mathbb{R}^n$, then $TM$ may be viewed as a subset of a different Euclidean space.
Thus, for example, for the circle $S^1$, the tangent bundle $T(S^1)$ can be realized as an
embedded submanifold of $\mathbb{R}^4$. In fact, we can parametrize $T(S^1)$ by

$$Y(u, t) = (\cos u, \sin u, -t\sin u, t\cos u) \quad \text{for } (u, t) \in [0, 2\pi] \times \mathbb{R}.$$

Therefore, even in this simple example, visualizing the tangent bundle requires more
than three dimensions. Nonetheless, it is not uncommon to illustrate the tangent
bundle over a manifold by a picture akin to Figure 5.1.


Proposition 5.1.6. If $M^m$ is a differentiable manifold of dimension $m$, and $V$ is
a real vector space of dimension $n$, then a differentiable vector bundle of fiber $V$
over $M$ is a differentiable manifold of dimension $m + n$.


Proof. Let $E$ be a vector bundle of fiber $V$ over a differentiable manifold $M$ with
the data described in Definition 5.1.1. Since $V$ is isomorphic to $\mathbb{R}^n$, without loss of
generality, let us take $V = \mathbb{R}^n$. On each open set $\pi^{-1}(U_\alpha)$ in the vector bundle $E$,
consider the function $\tau_\alpha$ defined by the composition

$$\tau_\alpha : \pi^{-1}(U_\alpha) \xrightarrow{\ \psi_\alpha^{-1}\ } U_\alpha \times \mathbb{R}^n \xrightarrow{\ \phi_\alpha \times \mathrm{id}\ } \mathbb{R}^m \times \mathbb{R}^n = \mathbb{R}^{m+n},$$

where by $\phi_\alpha \times \mathrm{id}$ we mean the function $(\phi_\alpha \times \mathrm{id})(p, v) = (\phi_\alpha(p), v)$. We prove
that the collection of functions $\{(\pi^{-1}(U_\alpha), \tau_\alpha)\}$ is an atlas that equips $E$ with the
structure of a differentiable manifold.
Figure 5.1: Intuitive picture for a tangent bundle.


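Since $\pi$ is continuous, $\pi^{-1}(U_\alpha)$ is open and, by construction, the collection of
open sets $\pi^{-1}(U_\alpha)$ covers $E$. The function $\phi_\alpha : U_\alpha \to V_\alpha$ is a homeomorphism, where
$V_\alpha$ is an open subset of $\mathbb{R}^m$. Therefore, it is easy to check that for each $\alpha \in I$,

$$\phi_\alpha \times \mathrm{id} : U_\alpha \times \mathbb{R}^n \to V_\alpha \times \mathbb{R}^n$$

is a homeomorphism. Thus, since $\psi_\alpha$ is a homeomorphism by definition, then
$\tau_\alpha = (\phi_\alpha \times \mathrm{id}) \circ \psi_\alpha^{-1}$ is also a homeomorphism.


Let $(y, v) \in \phi_\beta(U_\alpha \cap U_\beta) \times \mathbb{R}^n$, and let $(p, v) = (\phi_\beta \times \mathrm{id})^{-1}(y, v)$ so that $(p, v)$
is in the domain of the trivialization $\psi_\beta$. Then we calculate that

$$\begin{aligned}
(\tau_\alpha \circ \tau_\beta^{-1})(y, v) &= \bigl( (\phi_\alpha \times \mathrm{id}) \circ \psi_\alpha^{-1} \circ \psi_\beta \circ (\phi_\beta^{-1} \times \mathrm{id}) \bigr)(y, v) \\
&= \bigl( \phi_\alpha \circ \phi_\beta^{-1}(y),\ \theta_{\alpha\beta}(p)v \bigr)
\end{aligned}$$

because $\psi_\alpha^{-1} \circ \psi_\beta(p, v) = (p, \theta_{\alpha\beta}(p)v)$ by definition of a vector bundle.
At this stage, we must use the fact that $\theta_{\alpha\beta}$ is a differentiable map between
the differentiable manifolds $M$ and $\mathrm{GL}(\mathbb{R}^n)$. Since $\mathrm{GL}(\mathbb{R}^n)$ inherits its manifold
structure as an embedded submanifold of $\mathbb{R}^{n^2}$, the following quantities exist as
$n \times n$ matrices:

$$\frac{\partial (\theta_{\alpha\beta} \circ \phi_\beta^{-1})}{\partial y^i} \quad \text{for } 1 \leq i \leq m.$$

To simplify notations, we set $F = \theta_{\alpha\beta} \circ \phi_\beta^{-1}$. Then a simple calculation for the
function $\tau_\alpha \circ \tau_\beta^{-1}$ as a function from $\mathbb{R}^{m+n}$ into itself gives the following differential
as a block matrix:

$$[d(\tau_\alpha \circ \tau_\beta^{-1})_{(y, v)}] = \left( \begin{array}{c|c}
[d(\phi_\alpha \circ \phi_\beta^{-1})_y] & 0 \\ \hline
\dfrac{\partial F}{\partial y^1} v \ \cdots \ \dfrac{\partial F}{\partial y^m} v & F(y)
\end{array} \right).$$

Furthermore, each of the entries in the above matrix is continuous. This shows
that all the transition functions $\tau_\alpha \circ \tau_\beta^{-1}$ are of class $C^1$, establishing that the
differentiable vector bundle is indeed a differentiable manifold.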


It is not hard to see that by adapting the above proof, we can also show that
a $C^k$ (respectively, smooth) vector bundle is a $C^k$ (respectively, smooth) manifold.
However, we point out the following consequence for tangent bundles to a manifold.


Corollary 5.1.7. If $M$ is a manifold of class $C^k$ and dimension $m$, then $TM$
is a manifold of class $C^{k-1}$ and dimension $2m$. Furthermore, if $M$ is a smooth
manifold, then $TM$ is smooth as well.


Proof. This follows from the proof of the above proposition and the fact that the
linear transformation $\theta_{\beta\alpha}$ is $(\partial \bar{x}^j / \partial x^i)$, where $(\bar{x}^j)$ are the coordinates with respect
to $\phi_\beta$ and $(x^i)$ are the coordinates with respect to $\phi_\alpha$. Therefore, in order for the
functions $\theta_{\beta\alpha}$ to be of class $C^r$, the transition functions $\phi_\beta \circ \phi_\alpha^{-1}$ must be of class
$C^{r+1}$.

The second claim of the corollary follows immediately.


Example 5.1.8 (Tangent Bundle of $\mathbb{R}^n$). As we saw in Example 3.3.7, the tangent
plane to any point $p$ in $\mathbb{R}^n$ is again $\mathbb{R}^n$. However, we can now make the stronger
claim that the tangent bundle of $\mathbb{R}^n$ is $T(\mathbb{R}^n) = \mathbb{R}^n \times \mathbb{R}^n$. We can see this from the
fact that $\mathbb{R}^n$ is a manifold that can be equipped with an atlas of just one coordinate
chart. Then, from Definition 5.1.1, there is only one trivialization map. Thus, the
tangent bundle is a trivial bundle.


Chapter 4 introduced various constructions associated to a vector space $V$,
namely the dual $V^*$, the space $V^{\otimes p} \otimes V^{* \otimes q}$, the symmetric product $\operatorname{Sym}^k V$, and
the alternating product $\bigwedge^k V$. Also, if we are given a vector space $W$ of dimension
$n$, the direct sum $V \oplus W$ and the tensor product $V \otimes W$ are new vector spaces.
In each case, if $V$ and $W$ are equipped with bases, there exist natural bases on the
new vector spaces.

Constructions on vector spaces carry over to vector bundles over a differentiable
manifold $M$ in the following way. Let $\xi$ be a vector bundle over $M$ with fiber $V$,
and let $\eta$ be a vector bundle over $M$ with fiber $W$. It is possible to construct the
following vector bundles over $M$ in such a way that their bundle data are compatible
with the data for $\xi$ and $\eta$ and the properties of the associated fiber:


• The dual bundle $\xi^*$. The fiber is the vector space $V^*$.


• The direct sum $\xi \oplus \eta$. The fiber is the vector space $V \oplus W$. The direct sum
is also called the Whitney sum of two vector bundles.


• The tensor product $\xi \otimes \eta$. The fiber is the vector space $V \otimes W$.


• The symmetric product $\operatorname{Sym}^k \xi$ for some positive integer $k$. The fiber is the
vector space $\operatorname{Sym}^k V$.


• The alternating product $\bigwedge^k \xi$ for some positive integer $k$. The fiber is the
vector space $\bigwedge^k V$.


Each of the above situations requires careful construction and proof that they 
are in fact vector bundles over M. We omit the details here but refer the reader to 
Chapter 3 in [40] for a careful discussion of how to get new vector bundles from old 
ones. 

One of the first useful bundles constructed from the tangent bundle is the
cotangent bundle, $TM^*$, the dual to the tangent bundle. Recall that if $p \in M^m$, $U$ is an
open neighborhood of $p$ in $M$, and $x : U \to \mathbb{R}^m$ is a coordinate chart for $U$, then
the operators

$$\partial_1 = \frac{\partial}{\partial x^1} \bigg|_p, \ \ldots, \ \partial_m = \frac{\partial}{\partial x^m} \bigg|_p$$

form the associated basis of $T_pM$. The cobasis for the dual space $T_pM^*$ is denoted
by

$$dx^1, dx^2, \ldots, dx^m, \tag{5.6}$$

defined as the linear functions $T_pM \to \mathbb{R}$ such that

$$dx^i(\partial_j) = dx^i \left( \frac{\partial}{\partial x^j} \bigg|_p \right) = \delta^i_j = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \tag{5.7}$$

The dependence on the point $p \in M$ is understood by context.


Example 5.1.9. Consider a regular surface $M$ in $\mathbb{R}^3$. $M$ is an embedded
two-dimensional submanifold of $\mathbb{R}^3$. Consider the bundle $TM^* \otimes TM^*$ over $M$. Via
a comment after Proposition 4.5.2, we identify $TM^* \otimes TM^*$ as the vector bundle
over $M$ such that each fiber at a point $p \in M$ corresponds to the vector space of all
bilinear forms on $T_pM$.


The formalism of vector bundles over manifolds may initially appear unneces- 
sarily pedantic. However, since in general a manifold need not be given as a subset 
of an ambient Euclidean space, it is only in the context of the tangent bundle on a 
manifold that we can make sense of tangent vectors to M at various points p € M. 
We discussed how to obtain new bundles from old ones so that it would be possible 
to discuss other linear algebraic objects associated to the tangent bundle, such as 
bilinear forms on $TM$, as in Example 5.1.9.

The value for physics is that, in order to study the motion of a particle or a
system of particles that is not in $\mathbb{R}^n$, the ambient space for this system must
be a manifold. Without the structure of a differentiable manifold, we cannot talk
about differentiability at all. However, on a differentiable manifold, any kind of
differentiation will be given in reference to the tangent bundle. It is not hard
to imagine the need to do physics on a sphere, say when studying global earth
phenomena but only looking at the surface of the earth. In some natural problems,
the configuration space (the space in which the variables of interest exist) is not
a Euclidean space, and in this context, the equations of dynamics must take into
account the fact that the ambient space is a manifold. Perhaps the most blatant
examples of the need for manifolds come from cosmology, in which it is now well
understood that our universe is not flat. Therefore, doing cosmological calculations
(calculations on large portions of the universe) requires the manifold formalism.


PROBLEMS 


5.1.1. Consider the unit sphere $S^2$ equipped with the oriented stereographic atlas $\{\pi_N, \bar{\pi}_S\}$
described in Examples 3.7.3 and 3.1.4. Explicitly describe an atlas for the tangent
bundle $T(S^2)$ as a manifold and write down the transition functions for this atlas.


5.1.2. Normal Bundle. Consider a regular surface $S$ in $\mathbb{R}^3$. At each point $p \in S$, let $N(p)$
be the set of all normal vectors. Explicitly show that the points in $S$, along with
its normal vectors at corresponding points, form a vector bundle (in fact a line
bundle). Suppose that for each coordinate patch $U_\alpha$ parametrized by $\vec{X}(u, v)$ we
define $\psi_\alpha : U_\alpha \times \mathbb{R} \to \pi^{-1}(U_\alpha)$ as

$$\psi_\alpha(p, t) = p + t\, \vec{X}_u(u_0, v_0) \times \vec{X}_v(u_0, v_0),$$

where $p = \vec{X}(u_0, v_0)$. Determine the functions $\theta_{\beta\alpha}$ between different trivialization
maps. (This vector bundle is called the normal bundle.)


5.1.3. Normal Bundle. Let $M^m$ be a differentiable manifold embedded in $\mathbb{R}^n$ where
$m < n$. For all $p \in M$, let $N_p$ be the orthogonal complement to $T_pM$ in $\mathbb{R}^n$.
Prove that the disjoint union of all $N_p$ subspaces is a vector bundle over $M$. (This
vector bundle is called the normal bundle to $M$ and generalizes the situation in the
previous exercise.)


5.1.4. In the study of dynamics of a particle, one locates the position of a point in $\mathbb{R}^3$
using its three coordinates. Therefore, the variable space is $\mathbb{R}^3$. Explain why the
variable space for a general solid object (or system of particles rigidly attached to
each other) is $\mathbb{R}^3 \times \mathrm{SO}(3)$. In particular, explain why we require six variables to
completely describe the position of a solid object in $\mathbb{R}^3$.


5.1.5. Provide appropriate details behind the construction of the Whitney sum of two
vector bundles.


5.1.6. Consider the real projective space $M = \mathbb{RP}^n$. We view $\mathbb{RP}^n$ as the set of
one-dimensional subspaces of $\mathbb{R}^{n+1}$. Consider the set $\{(V, \vec{u}) \in \mathbb{RP}^n \times \mathbb{R}^{n+1} \mid \vec{u} \in V\}$.


(a) Show that this is a line bundle.
(b) Show that this line bundle is not the trivial bundle.


(This bundle is called the canonical line bundle on $\mathbb{RP}^n$.)


5.1.7. Consider the following parametrization for a torus $S^1 \times S^1$ as a subset of $\mathbb{R}^3$:

$$\vec{X}(u, v) = ((2 + \cos u)\cos v, (2 + \cos u)\sin v, \sin u),$$

for $(u, v) \in [0, 2\pi] \times [0, 2\pi]$. Using $\vec{X}$, give the associated parametrization of the
manifold $T(S^1 \times S^1)$ as a subset of $\mathbb{R}^6$.
5.2 Vector and Tensor Fields on Manifolds
5.2.1 Vector and Tensor Fields 


Definition 5.2.1. Let $\xi$ be a vector bundle over a manifold $M$ with fiber $V$, with
projection $\pi : E(\xi) \to M$. A global section of $\xi$ is a continuous map $s : M \to E(\xi)$
such that $\pi \circ s = \mathrm{id}_M$, the identity function on $M$. The set of all global sections
is denoted by $\Gamma(\xi)$. Given an open set $U \subseteq M$, we call a local section over $U$ a
continuous map $s : U \to E(\xi)$ such that $\pi \circ s = \mathrm{id}_U$. The set of all local sections
on $U$ is denoted by $\Gamma(U; \xi)$.


Note that sections of a vector bundle (whether local or global) can be added
or multiplied by a scalar in the following sense. If $s_1, s_2 \in \Gamma(U; \xi)$, then for each
$p \in U \subseteq M$, $s_1(p)$ and $s_2(p)$ are vectors in the same fiber $\pi^{-1}(p)$. Consequently,
for any scalars $a, b \in \mathbb{R}$, the linear combination $a s_1(p) + b s_2(p)$ is well defined as an
element in $\pi^{-1}(p)$.


Definition 5.2.2. Let $M$ be a differentiable manifold. A global section of $TM$ is
called a vector field on $M$. In other words, a vector field associates to each $p \in M$
a vector $X(p)$ (also denoted by $X_p$) in $T_pM$. The set of all vector fields on $M$ is
denoted by $\mathfrak{X}(M)$. A vector field $X$ is said to be of class $C^k$ if $X : M \to TM$ is a
map of class $C^k$ between manifolds.


We point out that if $U$ is an open subset of a smooth manifold $M$, then a vector
field $X \in \mathfrak{X}(U)$ is a derivation on $C^\infty(U, \mathbb{R})$.


Example 5.2.3 (Metric Tensor of a Surface). Let $M$ be a regular surface in $\mathbb{R}^3$. In
the local theory of regular surfaces in $\mathbb{R}^3$, the first fundamental form (alternatively
called the metric tensor) is the bilinear product $g = I_p(\cdot,\cdot)$ on $T_pM$ obtained as the
restriction of the dot product in $\mathbb{R}^3$ to the tangent plane $T_pM$. Therefore, with the
formalism of vector bundles and using Example 5.1.9, the first fundamental form
is a section of $TM^* \otimes TM^*$. In fact, since $I_p(\cdot,\cdot)$ is symmetric and defined for all
$p$, independent of any particular basis on $TM^* \otimes TM^*$, the metric tensor is a
global section of $\mathrm{Sym}^2\, TM^*$.

Let $p$ be a point of $M$, and let $U$ be a coordinate neighborhood of $p$ with
coordinates $(x^1, x^2)$. This coordinate system defines the basis

$$dx^1 \otimes dx^1, \quad dx^1 \otimes dx^2, \quad dx^2 \otimes dx^1, \quad dx^2 \otimes dx^2$$

on $T_pM^* \otimes T_pM^*$. Furthermore, each basis vector is a local section in $\Gamma(U, TM^* \otimes TM^*)$.
The coefficient functions $g_{ij}$ of the metric tensor are functions such that, as
an element of $\Gamma(U, TM^* \otimes TM^*)$, the metric tensor can be written as

$$g = g_{11}\, dx^1 \otimes dx^1 + g_{12}\, dx^1 \otimes dx^2 + g_{21}\, dx^2 \otimes dx^1 + g_{22}\, dx^2 \otimes dx^2.$$
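
To make the coefficient functions concrete, the following worked computation may help. It is an illustration added here, not part of the example above, and it assumes the unit sphere with the parametrization $x^{-1}(u,v) = (\cos u \sin v, \sin u \sin v, \cos v)$ used in Example 5.4.3, with $(x^1, x^2) = (u, v)$. The coefficients $g_{ij}$ are the dot products in $\mathbb{R}^3$ of the coordinate tangent vectors.

```latex
\begin{align*}
\partial_u &= (-\sin u \sin v,\ \cos u \sin v,\ 0), \\
\partial_v &= (\cos u \cos v,\ \sin u \cos v,\ -\sin v), \\
g_{11} &= \partial_u \cdot \partial_u = \sin^2 v, \qquad
g_{12} = g_{21} = \partial_u \cdot \partial_v = 0, \qquad
g_{22} = \partial_v \cdot \partial_v = 1, \\
g &= \sin^2 v \; du \otimes du + dv \otimes dv .
\end{align*}
```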


Definition 5.2.4. A tensor field of type $(r,s)$ is a global section of the vector
bundle $TM^{\otimes r} \otimes TM^{*\otimes s}$. The index $r$ is called the contravariant index, while the
index $s$ is called the covariant index.
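Let $A$ be a tensor field of type $(r,s)$ on a manifold $M^n$. Over a coordinate patch
$U$ of $M$ with coordinates $(x^1, x^2, \ldots, x^n)$, we write the components of $A$ as
$A^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}$. This means that the $A^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}$ are $n^{r+s}$ functions $U \to \mathbb{R}$ such that
with respect to the basis on $T_pM^{\otimes r} \otimes T_pM^{*\otimes s}$,

$$A = A^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}\, \frac{\partial}{\partial x^{i_1}} \otimes \cdots \otimes \frac{\partial}{\partial x^{i_r}} \otimes dx^{j_1} \otimes dx^{j_2} \otimes \cdots \otimes dx^{j_s}.$$

If $U'$ is another coordinate patch on $M$ with coordinates $(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n)$, we
label the components of $A$ in reference to this system as $\bar{A}^{k_1 k_2 \cdots k_r}_{l_1 l_2 \cdots l_s}$. Again, these
components are a collection of $n^{r+s}$ functions $U' \to \mathbb{R}$. On the intersection $U \cap U'$,
both sets of components describe the same tensor but in reference to different bases.
By Proposition 4.5.2, the components of $A$ change according to

$$\bar{A}^{k_1 k_2 \cdots k_r}_{l_1 l_2 \cdots l_s} = \frac{\partial \bar{x}^{k_1}}{\partial x^{i_1}} \frac{\partial \bar{x}^{k_2}}{\partial x^{i_2}} \cdots \frac{\partial \bar{x}^{k_r}}{\partial x^{i_r}}\, \frac{\partial x^{j_1}}{\partial \bar{x}^{l_1}} \frac{\partial x^{j_2}}{\partial \bar{x}^{l_2}} \cdots \frac{\partial x^{j_s}}{\partial \bar{x}^{l_s}}\, A^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_s}. \tag{5.8}$$

As anticipated by the comments in Section 4.5.4, we have generalized the notion
of tensors in $\mathbb{R}^n$ to tensor fields over a manifold $M$.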

As a point of terminology, a vector field on a manifold is a tensor field of type 
(1,0). In contrast, a tensor field of type (0,1) is often called a covariant vector field, 
or shorter, a covector field. We call any tensor field of type (r,0) a contravariant 
tensor field, and any tensor field of type (0, s) is called a covariant tensor field. 
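
As a quick illustration of how the transformation law (5.8) specializes, the block below writes it out for the three lowest types. It is a sketch added for orientation; the symbols $X$, $\omega$, and $g$ for a vector field, a covector field, and a type $(0,2)$ tensor field are labels chosen here, and the summation convention is assumed.

```latex
\begin{align*}
\text{type } (1,0): \quad & \bar{X}^{k} = \frac{\partial \bar{x}^{k}}{\partial x^{i}}\, X^{i}, \\
\text{type } (0,1): \quad & \bar{\omega}_{l} = \frac{\partial x^{j}}{\partial \bar{x}^{l}}\, \omega_{j}, \\
\text{type } (0,2): \quad & \bar{g}_{kl} = \frac{\partial x^{i}}{\partial \bar{x}^{k}}
  \frac{\partial x^{j}}{\partial \bar{x}^{l}}\, g_{ij} .
\end{align*}
```

Each contravariant index picks up one factor $\partial \bar{x}/\partial x$, and each covariant index picks up one factor $\partial x/\partial \bar{x}$.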


5.2.2 Operations of Tensor Fields 


Referring to the multilinear algebra developed in Section 4.4, there exist a number
of natural operations on tensor fields. Let $M$ be a differentiable manifold. Let $A$
be a tensor field of type $(r,s)$ and let $B$ be a tensor field of type $(k,\ell)$ on $M$. We
define the tensor product of $A$ and $B$ as the tensor field of type $(r+k, s+\ell)$ defined
by

$$(A \otimes B)_p = A_p \otimes B_p \quad \text{for } p \in M.$$

In this sense, the $\otimes$ operator is a bilinear transformation

$$\otimes : \Gamma(TM^{\otimes r} \otimes TM^{*\otimes s}) \times \Gamma(TM^{\otimes k} \otimes TM^{*\otimes \ell}) \to \Gamma(TM^{\otimes (r+k)} \otimes TM^{*\otimes (s+\ell)}).$$

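Let $X \in \mathfrak{X}(M)$ be a differentiable vector field on a manifold $M$ and let
$A \in \Gamma(TM^{\otimes r} \otimes TM^{*\otimes s})$ be a tensor field on $M$. Then the contraction operation between
$X$ and $A$ is the linear transformation

$$\mathfrak{X}(M) \otimes \Gamma(TM^{\otimes r} \otimes TM^{*\otimes s}) \to \Gamma(TM^{\otimes r} \otimes TM^{*\otimes (s-1)})$$

defined by contraction on the first covariant index of $A$. This is also denoted by
$i_X A$, where we view $i_X$ as a linear transformation
$\Gamma(TM^{\otimes r} \otimes TM^{*\otimes s}) \to \Gamma(TM^{\otimes r} \otimes TM^{*\otimes (s-1)})$.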
5.2.3 Push-Forwards of Vector Fields


We remind the reader that a tangent vector field $X$ on a manifold $M$ is such
that, at each point $p \in M$, we have a differential operator on real-valued functions
$X_p : C^1(M,\mathbb{R}) \to \mathbb{R}$. So a vector field is a function of both $p \in M$ and $f \in C^1(M,\mathbb{R})$.
If we apply $X$ to the function $f$ first, then we can think of a vector field as a mapping
$X : C^1(M,\mathbb{R}) \to C^0(M,\mathbb{R})$ via the identification

$$Xf = (p \mapsto X_p(f)). \tag{5.9}$$

Over a coordinate chart $U$ of $M$ with coordinate system $(x^1, x^2, \ldots, x^n)$, we write

$$X = \sum_{i=1}^{n} X^i \frac{\partial}{\partial x^i},$$

so the real-valued function $Xf$ on $M$ is defined by

$$(Xf)(p) = \sum_{i=1}^{n} X^i(p)\, \frac{\partial f}{\partial x^i}\Big|_p.$$


Recall from the definition of the differential that if $F : M \to N$ is a differentiable
map and $X$ is a vector field on $M$, then for each point $p \in M$ we define the vector
$F_*(X_p) = dF_p(X_p) \in T_{F(p)}N$ as the push-forward of $X_p$ by $F$. Unfortunately, this does
not in general define a vector field on $N$. If $F$ is not surjective, there is no natural
way to define a vector field associated to $X$ on $N - F(M)$. (Even proposing to define
the push-forward vector field to be 0 on $N - F(M)$ would not ensure a continuous vector
field on $N$.) Furthermore, if $F$ is not injective and if $p_1$ and $p_2$ are preimages of
a point $q \in F(M)$, then nothing guarantees that $F_*(X_{p_1}) = F_*(X_{p_2})$. Thus, the
push-forward is not well defined in this case either. However, we can make the
following definition.


Definition 5.2.5. Let $M$ and $N$ be differentiable manifolds, let $F : M \to N$ be a
differentiable map, let $X$ be a vector field on $M$, and let $Y$ be a vector field on $N$.
We say that $X$ and $Y$ are $F$-related if $F_*(X_p) = Y_{F(p)}$ for all $p \in M$.


With this terminology, the above comments can be rephrased to say that if $X$
is a vector field on $M$ and $F : M \to N$ is a differentiable map, then there does not
necessarily exist a vector field on $N$ that is $F$-related to $X$.


Proposition 5.2.6. Let $F : M \to N$ be a differentiable map between differentiable
manifolds. Let $X \in \mathfrak{X}(M)$ and $Y \in \mathfrak{X}(N)$. The vector field $X$ is $F$-related to $Y$ if
and only if for every open subset $U$ of $N$ and every function $f \in C^1(U,\mathbb{R})$ we have

$$X(f \circ F) = (Yf) \circ F.$$
Proof. For any $p \in M$ and any $f \in C^1(U,\mathbb{R})$, where $U$ is a neighborhood of $F(p)$,
by Proposition 3.4.2 we have

$$X(f \circ F)(p) = X_p(f \circ F) = dF_p(X_p)(f) = F_*(X_p)(f).$$

On the other hand,

$$((Yf) \circ F)(p) = (Yf)(F(p)) = Y_{F(p)} f.$$

Thus, $X(f \circ F) = (Yf) \circ F$ is true for all $f$ if and only if $F_*(X_p) = Y_{F(p)}$ for all
$p \in M$. The proposition follows.


Though in general vector fields cannot be pushed forward via a differentiable 
map, we show one particular case in which push-forwards for vector fields exist. 


Proposition 5.2.7. Let $X \in \mathfrak{X}(M)$ be a vector field, and let $F : M \to N$ be a
diffeomorphism. There exists a unique vector field $Y \in \mathfrak{X}(N)$ that is $F$-related to
$X$. Furthermore, if $X$ is of class $C^k$ and $F$ is a diffeomorphism of class $C^k$, then
so is $Y$.


Proof. In order for $X$ and $Y$ to be $F$-related, we must have $F_* X_p = Y_{F(p)}$. Therefore,
we define $Y_q = F_*(X_{F^{-1}(q)})$. Since $F$ is a diffeomorphism, the association
$q \mapsto Y_q$ is well defined. However, we must check that this association is continuous
before we can call it a vector field.

If $(x^i)$ is a coordinate system on a neighborhood of $p = F^{-1}(q)$ and if $(y^j)$ is a
coordinate system on a neighborhood of $q$, then in coordinates $Y_q$ is

$$Y_q = \frac{\partial F^j}{\partial x^i}\Big|_{F^{-1}(q)} X^i(F^{-1}(q))\, \frac{\partial}{\partial y^j}\Big|_q.$$

Finally, if $F^{-1}$ and $X$ are of class $C^k$, then by composition and the product rule, the
global section $N \to TN$ defined by $q \mapsto (q, Y_q)$ is of class $C^k$.


Definition 5.2.8. If $X \in \mathfrak{X}(M)$ and $F : M \to N$ is a diffeomorphism, then the
vector field $Y$ in Proposition 5.2.7 is called the push-forward of $X$ by $F$ and is
denoted by $F_* X$.
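
A small worked example may make the push-forward concrete. The choice of $M = (0,\infty)$, $N = \mathbb{R}$, $F(x) = \ln x$, and $X = x\, \frac{d}{dx}$ is an assumption made here for illustration; it is not taken from the text. The first two lines apply the formula $Y_q = F_*(X_{F^{-1}(q)})$, and the last line checks the criterion of Proposition 5.2.6.

```latex
\begin{align*}
F^{-1}(q) &= e^{q}, \qquad F'(x) = \frac{1}{x}, \\
(F_* X)_q &= F'\big(F^{-1}(q)\big)\, X\big(F^{-1}(q)\big)\, \frac{d}{dy}\Big|_q
           = \frac{1}{e^{q}} \cdot e^{q}\, \frac{d}{dy}\Big|_q = \frac{d}{dy}\Big|_q , \\
X(f \circ F)(x) &= x \cdot f'(\ln x) \cdot \frac{1}{x} = f'(\ln x) = \big((F_*X) f\big)(F(x)) .
\end{align*}
```

So under this diffeomorphism the scaling field $x\, \frac{d}{dx}$ on $(0,\infty)$ is pushed forward to the constant field $\frac{d}{dy}$ on $\mathbb{R}$.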


5.2.4 Integral Curves and Flows 


As we promised in the introduction to this chapter, vector bundles on a manifold 
allow for the possibility of doing physics on a manifold. We begin to see this through 
the existence of trajectories in what we might view as a velocity vector field. 

Let $\delta > 0$, and let $\gamma : (-\delta, \delta) \to M$ be a differentiable curve on $M$. Recall that
we must understand $\gamma$ as a differentiable function between manifolds. Let $X$ be a
vector field on $M$ so that for each $p \in M$, $X_p$ is a tangent vector in $T_pM$. Referring
to Example 3.4.3 for notation, the curve $\gamma$ is called a trajectory of $X$ through $p$ if
$\gamma(0) = p$ and

$$\gamma'(t) \overset{\text{def}}{=} \gamma_*\left(\frac{d}{dt}\right) = X_{\gamma(t)} \tag{5.10}$$

for all $t \in (-\delta, \delta)$. A trajectory is also called an integral curve of the vector field $X$
because it solves the differential equation represented in (5.10).

If $x : U \to \mathbb{R}^n$ is a coordinate patch of $M$ around $p$, it is by definition a
diffeomorphism onto $x(U)$. Hence, the push-forward $x_*(X)$ is a vector field on
$x(U)$. Call this $\vec{G}$, so $x_* X = \vec{G} : x(U) \to \mathbb{R}^n$. Call $\vec{c} : (-\delta, \delta) \to x(U)$ the curve
with $\vec{c}(t) = (x \circ \gamma)(t)$. Then applying $x_*$ to (5.10) gives

$$\frac{d\vec{c}}{dt} = x_*\left(\gamma_*\left(\frac{d}{dt}\right)\right) = x_*(X_{\gamma(t)}) = \vec{G}(\vec{c}(t)).$$

Consequently, (5.10) is equivalent (locally) to an ordinary differential equation in
$\mathbb{R}^n$. Theorems of existence, uniqueness, and continuous dependence on initial conditions
for ordinary differential equations carry over to the context of differentiable
manifolds. (See for example Sections 7 and 35 in [3] for the classic results in this
area.) Instead of proving the difficult theorems behind the following application to
differentiable manifolds, we restate [52, Theorem 5, Chapter 5].


Theorem 5.2.9. Let $M$ be a differentiable manifold of class $C^k$ with $k \geq 2$, and let
$X$ be a vector field on $M$ of class $C^k$. Let $p \in M$. There exists an open neighborhood
$V \subset M$ of $p$ and a positive real $\delta > 0$ such that there is a unique collection of
diffeomorphisms $\varphi_t : V \to \varphi_t(V)$ for $|t| < \delta$ with the following properties:
1. $\varphi_0 = \mathrm{id}_V$, i.e., $\varphi_0(q) = q$ for all $q \in V$.
2. $\varphi : (-\delta, \delta) \times V \to M$, defined by $\varphi(t, q) = \varphi_t(q)$, is $C^k$.
3. If $|s| < \delta$, $|t| < \delta$, and $|s+t| < \delta$, and both $q, \varphi_t(q) \in V$, then
$\varphi_{s+t}(q) = \varphi_s \circ \varphi_t(q)$.
4. If $q \in V$, then

$$\frac{\partial \varphi}{\partial t} = X_{\varphi(t,q)}; \tag{5.11}$$

in other words, for $\varepsilon$ small enough, the curve $\gamma : (-\varepsilon, \varepsilon) \to M$ defined by
$\gamma(t) = \varphi_t(q)$ is a trajectory of $X$, i.e., satisfies $\gamma'(0) = X_q$.


Definition 5.2.10. The function $\varphi : (-\delta, \delta) \times V \to M$ is called the flow of $X$ on
$M$ near $p$.


Figure 5.2 depicts a vector field $X$ on a two-dimensional manifold $M$ (embedded
in $\mathbb{R}^3$). The black curve is a particular trajectory since every tangent vector to the
curve at $p$ (a point on the curve) is parallel to $X_p$. To be precise, the curve shown
is only the locus of the trajectory since the trajectory itself is a curve parametrized
in such a way that the velocity vector at each point $p$ is exactly $X_p$.

Figure 5.2: A vector field on a manifold.


According to Definition 3.3.1, the last condition of Theorem 5.2.9 means that
$X_q = D_\gamma$. Thus, for all real-valued differentiable functions $f$ on a neighborhood of
$q$,

$$(Xf)(q) = X_q(f) = D_\gamma(f) = \frac{d}{dt} f(\varphi_t(q))\Big|_{t=0} = \lim_{h \to 0} \frac{f(\varphi_h(q)) - f(q)}{h}. \tag{5.12}$$


This equation simplifies many calculations, as we will soon see. 


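Example 5.2.11. As an example of a flow using this differential geometry notation,
consider the Euclidean plane $M = \mathbb{R}^2$ and the vector field $X = -y\,\partial_x + x\,\partial_y$. Recall
that the notation $\partial\varphi/\partial t$ means $\varphi_*(\partial/\partial t)$. Furthermore, $\varphi : (-\delta, \delta) \times \mathbb{R}^2 \to \mathbb{R}^2$ and,
with respect to the standard basis on the tangent plane of $\mathbb{R}^2$, the matrix of $\varphi_*$ is

$$[\varphi_*] = \begin{pmatrix} \dfrac{\partial \varphi^1}{\partial t} & \dfrac{\partial \varphi^1}{\partial x} & \dfrac{\partial \varphi^1}{\partial y} \\[2mm] \dfrac{\partial \varphi^2}{\partial t} & \dfrac{\partial \varphi^2}{\partial x} & \dfrac{\partial \varphi^2}{\partial y} \end{pmatrix}.$$

The vector $\partial/\partial t$ is the first basis vector of the tangent space at a point of $(-\delta, \delta) \times M$.
Thus, (5.11) means, in components with respect to the basis $\{\partial_x, \partial_y\}$,

$$\begin{pmatrix} \dfrac{\partial \varphi^1}{\partial t} \\[2mm] \dfrac{\partial \varphi^2}{\partial t} \end{pmatrix} = X_{\varphi(t,x,y)} = \begin{pmatrix} -\varphi^2 \\ \varphi^1 \end{pmatrix}.$$

This implies that

$$\frac{\partial^2 \varphi^1}{\partial t^2} = -\frac{\partial \varphi^2}{\partial t} = -\varphi^1.$$

Using techniques to solve linear second-order differential equations with constant
coefficients, we see that $\varphi^1(t,x,y) = A\cos t + B\sin t$, where $A$ and $B$ are functions
of $x$ and $y$. Since $\varphi^2 = -\partial\varphi^1/\partial t$, we have $\varphi^2(t,x,y) = A\sin t - B\cos t$. However,
the condition that $\varphi_0(q) = q$ for all $q \in M$ means for this function that $A = x$ and
$B = -y$. Thus, we find that

$$\varphi(t,x,y) = \begin{pmatrix} x\cos t - y\sin t \\ x\sin t + y\cos t \end{pmatrix}.$$

It is not hard to check that the trajectories for the flow of this vector field $X$ consist
of circles centered at the origin.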
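
The closed form above can be compared against a numerical integration of the trajectory equation (5.10). The following is a minimal sketch using only the Python standard library; the function names `X`, `rk4_flow`, and `exact_flow` are labels chosen here, and the classical fourth-order Runge-Kutta scheme is one reasonable integrator among many, not a method taken from the text.

```python
import math


def X(point):
    """Components of the vector field X = -y d/dx + x d/dy at a point of R^2."""
    x, y = point
    return (-y, x)


def rk4_flow(field, point, t, steps=1000):
    """Approximate phi_t(point) by integrating gamma' = field(gamma) with RK4."""
    h = t / steps
    x, y = point
    for _ in range(steps):
        k1 = field((x, y))
        k2 = field((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = field((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = field((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)


def exact_flow(point, t):
    """Closed-form flow from the example: rotation of the plane by angle t."""
    x, y = point
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))


if __name__ == "__main__":
    p = (1.0, 2.0)
    print(rk4_flow(X, p, 0.7), exact_flow(p, 0.7))
```

Up to discretization error, the two results agree and the distance to the origin is preserved along the trajectory, which is the numerical counterpart of the statement that the trajectories are circles centered at the origin.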


PROBLEMS 
5.2.1. Let $M = S^2$ be the unit sphere and let $U$ be the coordinate patch parametrized by

$$x^{-1}(u^1, u^2) = (\cos u^1 \sin u^2, \sin u^1 \sin u^2, \cos u^2)$$

with $(u^1, u^2) \in (0, 2\pi) \times (0, \pi)$. Let $X = \cos u^1 \sin u^2\, \partial_1 + \sin u^1 \sin u^2\, \partial_2$, $Y = \partial_2$,
and $Z = \sin u^2\, \partial_1$ be vector fields over $U$.

(a) Show that $X$ and $Z$ can be extended continuously to vector fields over all of
$M$.
(b) Show that $Y$ cannot be extended continuously to a vector field in $\mathfrak{X}(M)$.

5.2.2. Let $S$ be a regular surface in $\mathbb{R}^3$, and let $X$ be a vector field on $\mathbb{R}^3$. For every
$p \in S$, define $Y_p$ as the orthogonal projection of $X_p$ onto $T_pS$. Show that $Y$ is a
vector field on $S$.


5.2.3. Suppose that $M$ is the torus that has a dense coordinate patch parametrized by
$x^{-1}(u,v) = ((3 + \cos v)\cos u, (3 + \cos v)\sin u, \sin v)$.


Consider the vector field X = —20/0x + 20/0z € R®. In terms of the coordinates 
(u,v), calculate the vector field on M induced from X by orthogonal projection, 
as described in the previous exercise. 


5.2.4. Let $M = S^1 \times S^1 \times S^1$ be the 3-torus given as an embedded submanifold of $\mathbb{R}^4$ by
the parametrization

$$x^{-1}(u,v,w) = \big((4 + (2 + \cos u)\cos v)\cos w,\ (4 + (2 + \cos u)\cos v)\sin w,\ (2 + \cos u)\sin v,\ \sin u\big).$$

Consider the radial vector field in $\mathbb{R}^4$ given by $Z = x^1\partial_1 + x^2\partial_2 + x^3\partial_3 + x^4\partial_4$. In
terms of the coordinates $(u,v,w)$, calculate the vector field $X$ on $M$ induced from
$Z$ by orthogonal projection of $Z_p$ onto $T_pM$ for all $p \in M$.


5.2.5. Find a vector field on $S^2$ that vanishes at one point. Write down a formula
for this vector field in some coordinate patch of $S^2$.




5.2.6. Referring to Problem 5.2.1, prove that the flow of $Z$ is

$$\varphi_t(u^1, u^2) = \begin{pmatrix} u^1 + t\sin u^2 \\ u^2 \end{pmatrix}.$$


5.2.7. Prove that $TS^1$ is diffeomorphic to $S^1 \times \mathbb{R}$.


5.2.8. Let $M$ be any differentiable manifold. Show that $\mathfrak{X}(M)$ is an infinite-dimensional
vector space.


5.3. Lie Bracket and Lie Derivative 
5.3.1 Lie Bracket 


Consider a $C^2$-manifold $M$ and let $X$ be a differentiable vector field over $M$. Recall
that calling a vector field differentiable means that the function $X : M \to TM$ is
a differentiable map or, equivalently, that over any coordinate chart the corresponding
components $X^i$ of $X$ are differentiable real-valued functions. With the above interpretation
of a vector field on $M$, we can talk about the functions $Y(Xf)$ or $X(Yf)$,
where $X$ and $Y$ are two differentiable vector fields on $M$. However, neither of the
composition operations $XY$ or $YX$ leads to another vector field.
Letting $X = X^i\partial_i$ and $Y = Y^j\partial_j$, for any function $f \in C^2(M,\mathbb{R})$, we have

$$X(Yf) = X(Y^j\partial_j f) = X^i\partial_i(Y^j\partial_j f) = X^i\,\partial_i Y^j\,\partial_j f + X^i Y^j\,\partial_i\partial_j f. \tag{5.13}$$


Thus, we see that $f \mapsto X(Yf)(p)$ is not a tangent vector to $M$ at $p$ since it involves
a repeated differentiation of $f$. Nonetheless, we do have the following proposition.


Proposition 5.3.1. Let $M$ be a $C^2$-manifold, and let $X$ and $Y$ be two vector fields
of class $C^1$. Then the operation $f \mapsto (XY - YX)f$ is another vector field.


Proof. Since the second derivatives of $f$ are continuous, the mixed partials
with respect to the same variables, though ordered differently, are equal. By using
Equation (5.13) twice, we find that

$$(XY - YX)f = (X^i\,\partial_i Y^j\,\partial_j f) - (Y^j\,\partial_j X^i\,\partial_i f) = (X^i\,\partial_i Y^j - Y^i\,\partial_i X^j)\,\frac{\partial f}{\partial x^j}.$$

Since for all $j = 1, \ldots, n$ the expressions in the above parentheses are continuous
real-valued functions on $M$, $(XY - YX)$ has the structure of a vector field.
We leave it as an exercise for the reader to show that the coordinates of $(XY - YX)$
change contravariantly under a basis change in $T_pM$.


Definition 5.3.2. The vector field defined in Proposition 5.3.1 is called the Lie
bracket of $X$ and $Y$ and is denoted by $[X,Y] = XY - YX$. If $X$ and $Y$ are of class
$C^n$ with $n \geq 1$, then $[X,Y]$ is of class $C^{n-1}$. Also, if $X$ and $Y$ are smooth vector
fields, then so is $[X,Y]$. More precisely,

$$[X,Y]_p(f) = X_p(Yf) - Y_p(Xf) \tag{5.14}$$

for all $p \in M$ and all $f \in C^2(M,\mathbb{R})$.


The proof of Proposition 5.3.1 shows that, in a coordinate neighborhood, if
$X = X^i\partial_i$ and $Y = Y^j\partial_j$, the Lie bracket is

$$[X,Y] = (X^i\,\partial_i Y^j - Y^i\,\partial_i X^j)\,\partial_j. \tag{5.15}$$


This formula gives a coordinate-dependent definition of the Lie bracket, while (5.14) 
is a coordinate-free definition. 
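
The coordinate formula (5.15) is easy to evaluate numerically. The sketch below is a hedged illustration using only the Python standard library: the function name `lie_bracket`, the step size `h`, and the use of central differences to approximate the partial derivatives are all choices made here, not part of the text. Vector fields are represented as functions that take a point (a list of coordinates) and return the list of components.

```python
def lie_bracket(X, Y, p, h=1e-5):
    """Approximate the components of [X, Y] at p using formula (5.15).

    X and Y map a point (sequence of n floats) to a sequence of n components.
    Partial derivatives are approximated by central differences with step h.
    """
    n = len(p)

    def partials(F, i):
        # Central-difference approximation of d F^j / d x^i at p, for all j.
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        return [(a - b) / (2 * h) for a, b in zip(F(plus), F(minus))]

    Xp, Yp = X(p), Y(p)
    dX = [partials(X, i) for i in range(n)]  # dX[i][j] ~ d_i X^j
    dY = [partials(Y, i) for i in range(n)]  # dY[i][j] ~ d_i Y^j
    return [
        sum(Xp[i] * dY[i][j] - Yp[i] * dX[i][j] for i in range(n))
        for j in range(n)
    ]


if __name__ == "__main__":
    # X = d/dx and Y = x d/dy on R^2; formula (5.15) gives [X, Y] = d/dy.
    print(lie_bracket(lambda q: [1.0, 0.0], lambda q: [0.0, q[0]], [0.3, -1.2]))
```

For the rotation field $-y\,\partial_x + x\,\partial_y$ and the radial field $x\,\partial_x + y\,\partial_y$, the same routine returns components that vanish up to rounding, consistent with a direct application of (5.15).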


Example 5.3.3. Consider the manifold $\mathbb{R}^3 - \{(x,y,z) \mid z = 0\}$, and consider the
two vector fields


- = oy 2 i We 
iS 2 + (a+ ye. 
The one iterated derivation is 
AVS = (ug, + 355-0 a) Ge + +3) 
= ou5ge +5, tHE + pros + Ziyoa * 2 
“(2 + y) Si 3yz ae 3yz3 (a + not 


The expression Y X f has exactly the same second derivative expressions for f, and 
upon subtracting, we find that 


0 1 () 1 ) 
X,Y) =(xXY-YX)= | + Qyz? ny 
[X,Y] =( ) Vag t pets, (wy + = + 9y2"(@ +9) 5 
The Lie bracket has the following algebraic properties. 


Proposition 5.3.4. Let $X$, $Y$, and $Z$ be differentiable vector fields on a differentiable
manifold $M$. Let $a, b \in \mathbb{R}$, and let $f$ and $g$ be differentiable functions $M \to \mathbb{R}$.
Then the following hold:


1. Anticommutativity: $[Y, X] = -[X, Y]$.
2. Bilinearity: $[aX + bY, Z] = a[X, Z] + b[Y, Z]$ and similarly for the second input
to the bracket.
3. Jacobi identity: $[[X,Y], Z] + [[Y,Z], X] + [[Z,X], Y] = 0$.
4. $[fX, gY] = fg[X,Y] + fX(g)Y - gY(f)X$.

Proof. (The proofs of these facts are straightforward and are left as exercises for
the reader.)
5.3.2 Lie Derivative of Vector Fields


A key concept in analysis is the ability to take derivatives. We have studied differ- 
ential operators of functions f : M — R. However, a central theme of analysis on 
manifolds pertains to defining operators on vector fields and tensor fields that be- 
have like derivatives. Though we begin this theme here, we revisit it in Section 5.6 
and in Section 6.2. 

Suppose that $f$ is a differentiable function on some open set $U$ of $\mathbb{R}^n$, let $p \in U$,
and let $\vec{v}$ be a vector in $\mathbb{R}^n$. In calculus, assuming that $\vec{v}$ is a unit vector, we define
the directional derivative of $f$ along $\vec{v}$ at $p$ by

$$\frac{d}{dt}\big(f(p + t\vec{v})\big)\Big|_{t=0} = \lim_{h \to 0} \frac{1}{h}\big(f(p + h\vec{v}) - f(p)\big).$$

We can extend this concept to vector fields in a natural way. If $X$ is a differentiable
vector field on $U$, then the directional derivative of $X$ at $p$ along $\vec{v}$ is

$$\frac{d}{dt}\big(X(p + t\vec{v})\big)\Big|_{t=0} = \lim_{h \to 0} \frac{1}{h}\big(X(p + h\vec{v}) - X(p)\big).$$

For each vector $\vec{v}$, this defines a new vector field on $U$. If the Cartesian coordinate
functions of $X$ are $X^i$, then the $i$th component of the directional derivative is

$$\frac{d}{dt} X^i(p^1 + tv^1, \ldots, p^n + tv^n)\Big|_{t=0} = \frac{\partial X^i}{\partial x^j}\Big|_p v^j.$$

This notion does not easily generalize to manifolds because it relies on the fact that Euclidean space
is both a manifold and a vector space. Furthermore, identifying $T_p\mathbb{R}^n$ with $\mathbb{R}^n$ for
all $p$ allows the expression $X(p + t\vec{v}) - X(p)$ to have meaning. In a general manifold,
the two vectors being compared lie in different tangent spaces, so taking their difference makes no
sense. However, using the push-forward of a “backwards flow” we can propose an
operation that makes sense.


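Definition 5.3.5. Let $M$ be a $C^2$-manifold, let $X \in \mathfrak{X}(M)$ be a $C^2$ vector field
on $M$, and let $\varphi_t$ be the flow of $X$ on $M$.


1. If $f \in C^1(M,\mathbb{R})$, we define the Lie derivative of $f$ by $X$ as the function
$\mathcal{L}_X f \overset{\text{def}}{=} Xf$.

2. If $Y \in \mathfrak{X}(M)$ is another differentiable vector field, we define the Lie derivative
of $Y$ by $X$ as the vector field $\mathcal{L}_X Y$ with

$$(\mathcal{L}_X Y)_p \overset{\text{def}}{=} \frac{d}{dt}\Big((\varphi_{-t})_*(Y_{\varphi_t(p)})\Big)\Big|_{t=0} = \lim_{h \to 0} \frac{1}{h}\Big(((\varphi_{-h})_* Y)_p - Y_p\Big). \tag{5.16}$$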


Figure 5.3 illustrates the definition of the Lie derivative by depicting the trajectory
of $X$ through $p$ as well as the trajectories of $Y$ through $p$ and $\varphi_h(p)$.

By Theorem 5.2.9, the flow exists for small $t \neq 0$ and $(\varphi_{-t})_* Y_{\varphi_t(p)}$ is a vector
in $T_pM$, so the difference of vectors in (5.16) is well defined. To clarify notation,
$(\varphi_{-t})_*(Y_{\varphi_t(p)})$ is the differential of $\varphi_{-t}$ applied to the tangent vector $Y_{\varphi_t(p)}$, while
$((\varphi_{-t})_* Y)_p$ is the push-forward of the vector field $Y$ by the diffeomorphism $\varphi_{-t}$,
then evaluated at $p$. The above definition claimed that $\mathcal{L}_X Y$ is a vector field; we
need to prove it.

Figure 5.3: Illustration of (5.16).


Proposition 5.3.6. Let $M$ be a manifold of class $C^k$, let $X$ be a vector field of
class $C^k$ on $M$, and let $Y$ be a vector field of class $C^\ell$ on $M$. Then $\mathcal{L}_X Y$ is a vector field
on $M$ of class $C^r$, where $r = \min(k-2, \ell-1)$. If the manifold and vector fields are
smooth, then so is $\mathcal{L}_X Y$.


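Proof. Let $\varphi$ be the flow of $X$ on $M$. By Theorem 5.2.9(2), the flow
$\varphi : (-\delta, \delta) \times M \to M$ is of class $C^k$. Let $p \in M$ and let $(U, x)$ be a coordinate chart of a
neighborhood of $p$. There exists a domain $(-\varepsilon, \varepsilon) \times U_0$ with $p \in U_0$ such that $\varphi$
maps $(-\varepsilon, \varepsilon) \times U_0$ into $U$. Write $\varphi^i = x^i \circ \varphi$ for the component functions of the flow
in $U$. The components of the matrix of the differential $(\varphi_{-t})_* : T_{\varphi_t(p)}M \to T_pM$
are

$$\frac{\partial \varphi^i}{\partial x^j}\big(-t, \varphi(t,p)\big). \tag{5.17}$$

Consequently, if $Y = Y^j\partial_j$ over $U$, then

$$(\varphi_{-t})_* Y_{\varphi_t(p)} = \frac{\partial \varphi^i}{\partial x^j}\big(-t, \varphi(t,p)\big)\, Y^j(\varphi(t,p))\, \frac{\partial}{\partial x^i}\Big|_p. \tag{5.18}$$

We obtain $\mathcal{L}_X Y$ by taking the derivative of the component functions in (5.18) with
respect to $t$ and setting $t = 0$. The functions $Y^j$ are of class $C^\ell$, $\varphi(t,p)$ is of class $C^k$,
and $\partial\varphi^i/\partial x^j$ is of class $C^{k-1}$, so taking the derivative with respect to $t$ decreases the
differentiability class by 1. The result follows.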


Example 5.3.7. As a specific example of using (5.16), let us revisit Example 5.2.11,
where $M = \mathbb{R}^2$ and $X = -y\,\partial_x + x\,\partial_y$. In standard coordinates with respect to these
bases,

$$\varphi_{-t}(x,y) = \begin{pmatrix} x\cos t + y\sin t \\ -x\sin t + y\cos t \end{pmatrix} \quad \text{and} \quad [(\varphi_{-t})_*] = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}.$$

This example is particularly simple because $(\varphi_{-t})_*$ is independent of $(x,y)$. Hence,
the matrix in (5.17) is the same as $(\varphi_{-t})_*$ expressed here. Then for a vector field
$Y$ with component functions $Y^1(x,y)$ and $Y^2(x,y)$, using (5.18) we get

$$(\varphi_{-t})_*\big(Y_{\varphi_t(x,y)}\big) = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix} \begin{pmatrix} Y^1(x\cos t - y\sin t,\ x\sin t + y\cos t) \\ Y^2(x\cos t - y\sin t,\ x\sin t + y\cos t) \end{pmatrix}.$$
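
The last display can be differentiated numerically at $t = 0$ to approximate $(\mathcal{L}_X Y)_p$ directly from definition (5.16). The following is a minimal sketch using only the Python standard library; the names `pushforward_back` and `lie_derivative`, the step size `h`, and the sample field $Y = x^2\,\partial_x$ are assumptions made here for illustration.

```python
import math


def pushforward_back(Y, p, t):
    """Evaluate (phi_{-t})_* Y_{phi_t(p)} for the rotation flow of X = -y d/dx + x d/dy.

    Y maps a point (x, y) to its pair of components (Y^1, Y^2).
    """
    x, y = p
    c, s = math.cos(t), math.sin(t)
    q = (x * c - y * s, x * s + y * c)          # q = phi_t(p)
    y1, y2 = Y(q)                               # components of Y at q
    return (c * y1 + s * y2, -s * y1 + c * y2)  # apply the matrix of (phi_{-t})_*


def lie_derivative(Y, p, h=1e-5):
    """Central-difference approximation of the t-derivative at t = 0 in (5.16)."""
    a = pushforward_back(Y, p, h)
    b = pushforward_back(Y, p, -h)
    return ((a[0] - b[0]) / (2 * h), (a[1] - b[1]) / (2 * h))


if __name__ == "__main__":
    # Sample field Y = x^2 d/dx (hypothetical choice for this illustration).
    print(lie_derivative(lambda q: (q[0] ** 2, 0.0), (1.0, 2.0)))
```

For this sample field, formula (5.15) gives $[X,Y] = -2xy\,\partial_x - x^2\,\partial_y$, so at the point $(1,2)$ the numerical values should be close to $(-4, -1)$.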


There are a few other natural ways we can bring two nearby vectors into the 
same tangent space in order to perform a limiting difference. However, they turn 
out to be equal to that given in (5.16). We leave it as an exercise to the reader to 
prove the following proposition. 


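Proposition 5.3.8. If $X$ and $Y$ are differentiable vector fields on a $C^2$-manifold,
then

$$(\mathcal{L}_X Y)_p = \lim_{h \to 0} \frac{1}{h}\Big(Y_p - (\varphi_h)_*(Y_{\varphi_{-h}(p)})\Big) = \lim_{h \to 0} \frac{1}{h}\Big(Y_{\varphi_h(p)} - (\varphi_h)_*(Y_p)\Big).$$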


We usually understand an operator on functions to be a differential operator if 
it is linear and satisfies an appropriate Leibniz rule (product rule). The following 
proposition shows this is the case for the Lie derivative. 


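Proposition 5.3.9. Let $X$, $Y$, and $Z$ be vector fields on a differentiable manifold
$M$ and let $f : M \to \mathbb{R}$ be a differentiable function. Then

1. $\mathcal{L}_X(Y + Z) = \mathcal{L}_X Y + \mathcal{L}_X Z$;

2. $\mathcal{L}_X(fY) = (\mathcal{L}_X f)Y + f(\mathcal{L}_X Y)$.

Proof. Part (1) follows immediately from the linearity properties of $(\varphi_{-t})_*$ and of
the derivative operator $d/dt$.
For Part (2), if $\varphi$ is the flow of $X$ on $M$, then

$$\begin{aligned}
\mathcal{L}_X(fY)_p &= \lim_{h \to 0} \frac{1}{h}\Big((\varphi_{-h})_*\big(f(\varphi_h(p))\,Y_{\varphi_h(p)}\big) - f(p)Y_p\Big) \\
&= \lim_{h \to 0} \frac{1}{h}\Big(f(\varphi_h(p))\,(\varphi_{-h})_*(Y_{\varphi_h(p)}) - f(\varphi_h(p))\,Y_p\Big) + \lim_{h \to 0} \frac{1}{h}\Big(f(\varphi_h(p))\,Y_p - f(p)\,Y_p\Big) \\
&= \Big(\lim_{h \to 0} f(\varphi_h(p))\Big) \lim_{h \to 0} \frac{1}{h}\Big((\varphi_{-h})_*(Y_{\varphi_h(p)}) - Y_p\Big) + \Big(\lim_{h \to 0} \frac{1}{h}\big(f(\varphi_h(p)) - f(p)\big)\Big) Y_p \\
&= f(p)(\mathcal{L}_X Y)_p + (Xf)(p)\,Y_p.
\end{aligned}$$

The formula follows from the definition $\mathcal{L}_X f = Xf$.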


With the identities of Proposition 5.3.9, we can calculate the coordinate-dependent 
formula for the Lie derivative, which leads to the following interesting result. 


Theorem 5.3.10. Let $X$ and $Y$ be $C^2$ vector fields on a $C^2$-manifold $M$. Then
$\mathcal{L}_X Y = [X,Y]$.


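Proof. Let $p \in M$ and let $(U, x)$ be a coordinate chart on a neighborhood of $p$.
Suppose that $X = X^i\partial_i$ and $Y = Y^j\partial_j$ in coordinates over this chart and let $\varphi_t(x)$
be the flow of $X$ over $U$.

We point out two facts about the flow of $X$. First, since $\lim_{h \to 0} \varphi_h(x) = x$ for
all $x \in U$, then

$$\lim_{h \to 0} \frac{\partial \varphi^i_h}{\partial x^j} = \delta^i_j$$

for all $i, j$, so in particular,

$$\lim_{h \to 0} \frac{\partial \varphi^i_{-h}}{\partial x^j}(\varphi_h(x)) - \delta^i_j = 0. \tag{5.19}$$

This leads to our second observation: we claim that

$$\lim_{h \to 0} \frac{1}{h}\left(\frac{\partial \varphi^i_{-h}}{\partial x^j}(\varphi_h(x)) - \delta^i_j\right) = -\frac{\partial X^i}{\partial x^j}(x). \tag{5.20}$$

To see this, note that by definition of differentiability in $t$, the component functions
$\varphi^i_t(x)$ satisfy

$$\varphi^i_t(x) = x^i + \frac{d\varphi^i_t(x)}{dt}\Big|_{t=0}\, t + tR^i(t,x) = x^i + X^i(x)\,t + tR^i(t,x)$$

for remainder functions $R^i : (-\varepsilon, \varepsilon) \times U \to \mathbb{R}$ such that $\lim_{t \to 0} R^i(t,x) = 0$. Then

$$\frac{\partial \varphi^i_t}{\partial x^j} = \delta^i_j + \frac{\partial X^i}{\partial x^j}\, t + t\,\frac{\partial R^i}{\partial x^j}.$$

This gives

$$\lim_{h \to 0} \frac{1}{h}\left(\frac{\partial \varphi^i_{-h}}{\partial x^j}(\varphi_h(x)) - \delta^i_j\right) = \lim_{h \to 0} \frac{1}{h}\left(-\frac{\partial X^i}{\partial x^j}(\varphi_h(x))\,h - h\,\frac{\partial R^i}{\partial x^j}(-h, \varphi_h(x))\right) = -\frac{\partial X^i}{\partial x^j}(x).$$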
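We can now calculate the Lie derivative in components. By definition of differentiability
near $t = 0$, we can write the component functions of $Y_{\varphi_t(x)}$, which we
express for now by $Y^j(\varphi_t(x))$, as

$$Y^j(\varphi_t(x)) = Y^j(x) + \frac{d}{dt}\big(Y^j(\varphi_t(x))\big)\Big|_{t=0}\, t + tS^j(t,x),$$

for some remainder functions $S^j(t,x)$ that satisfy $\lim_{t \to 0} S^j(t,x) = 0$. Then

$$\begin{aligned}
Y^j(\varphi_t(x)) &= Y^j(x) + \frac{\partial Y^j}{\partial x^k}(x)\,\frac{d\varphi^k_t}{dt}\Big|_{t=0}\, t + tS^j(t,x) \\
&= Y^j(x) + \frac{\partial Y^j}{\partial x^k}(x)\,X^k(x)\,t + tS^j(t,x).
\end{aligned}$$

By (5.18) the components of $(\varphi_{-t})_* Y_{\varphi_t(p)}$ are

$$\frac{\partial \varphi^i_{-t}}{\partial x^j}(\varphi_t(p))\left(Y^j(p) + \frac{\partial Y^j}{\partial x^k}(p)\,X^k(p)\,t + tS^j(t,p)\right),$$

so the components of $\mathcal{L}_X Y$ are

$$\begin{aligned}
(\mathcal{L}_X Y)^i(p) &= \lim_{h \to 0} \frac{1}{h}\left(\frac{\partial \varphi^i_{-h}}{\partial x^j}(\varphi_h(p))\,Y^j(p) + h\,\frac{\partial \varphi^i_{-h}}{\partial x^j}(\varphi_h(p))\,\frac{\partial Y^j}{\partial x^k}(p)\,X^k(p) + h\,\frac{\partial \varphi^i_{-h}}{\partial x^j}(\varphi_h(p))\,S^j(h,p) - Y^i(p)\right) \\
&= \lim_{h \to 0} \left(\frac{1}{h}\left(\frac{\partial \varphi^i_{-h}}{\partial x^j}(\varphi_h(p))\,Y^j(p) - Y^i(p)\right) + \frac{\partial \varphi^i_{-h}}{\partial x^j}(\varphi_h(p))\,\frac{\partial Y^j}{\partial x^k}(p)\,X^k(p) + \frac{\partial \varphi^i_{-h}}{\partial x^j}(\varphi_h(p))\,S^j(h,p)\right),
\end{aligned}$$

so by (5.19),

$$\begin{aligned}
(\mathcal{L}_X Y)^i(p) &= \lim_{h \to 0} \left(\frac{1}{h}\left(\frac{\partial \varphi^i_{-h}}{\partial x^j}(\varphi_h(p)) - \delta^i_j\right) Y^j(p) + \delta^i_j\,\frac{\partial Y^j}{\partial x^k}(p)\,X^k(p) + \delta^i_j\,S^j(h,p)\right) \\
&= -\frac{\partial X^i}{\partial x^j}(p)\,Y^j(p) + \frac{\partial Y^i}{\partial x^k}(p)\,X^k(p),
\end{aligned}$$

where the last equality holds by (5.20). After replacing the summation variable $k$
with $j$, we recover the component description of the Lie bracket given in (5.15).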


We mention this first corollary simply to reiterate the coordinate-dependent
expression for the Lie derivative.

Figure 5.4: The curve paths defining c(t).


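Corollary 5.3.11. Let $X$ and $Y$ be vector fields on a $C^2$-manifold $M$. Suppose
that over a coordinate patch $(U, x)$ of $M$ we have $X = X^i\partial_i$ and $Y = Y^j\partial_j$ in
components. Then the components of the Lie derivative are

$$(\mathcal{L}_X Y)^j = X^i\,\frac{\partial Y^j}{\partial x^i} - Y^i\,\frac{\partial X^j}{\partial x^i}.$$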


Theorem 5.3.10 immediately leads to the following interesting corollary. 


Corollary 5.3.12. Let $X$, $Y$, and $Z$ be differentiable vector fields on $M$ and let
$f \in C^1(M,\mathbb{R})$. Then

1. $\mathcal{L}_Y X = -\mathcal{L}_X Y$.

2. $\mathcal{L}_{(X+Y)} Z = \mathcal{L}_X Z + \mathcal{L}_Y Z$.


This is rather striking because of the following observation. The linearity rules
of the Lie derivative $\mathcal{L}_X$ as described in Proposition 5.3.9 followed easily from
the linearity of $(\varphi_{-t})_*$ and a product rule. However, proving that
$\mathcal{L}_{(X+Y)} Z = \mathcal{L}_X Z + \mathcal{L}_Y Z$ from the definition would be intractable because there is no immediately
obvious connection between the sum of two vector fields and their flows on the
manifold.

Theorem 5.3.10 implies a number of nonobvious properties for the Lie derivative, 
the proofs of which we leave as exercises for the reader. 


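Proposition 5.3.13. Let $X$, $Y$, and $Z$ be differentiable vector fields on $M$, and let
$F : M \to N$ be a diffeomorphism between manifolds. Then


1. $\mathcal{L}_X [Y, Z] = [\mathcal{L}_X Y, Z] + [Y, \mathcal{L}_X Z]$.
2. $\mathcal{L}_{[X,Y]} Z = \mathcal{L}_X \mathcal{L}_Y Z - \mathcal{L}_Y \mathcal{L}_X Z$.
3. $F_*(\mathcal{L}_X Y) = \mathcal{L}_{F_* X}(F_* Y)$.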


In Section 5.6, we will expand the definition of the Lie derivative to tensor fields 
of all types, and not just functions and tangent vector fields. 
Besides the algebraic properties, the Lie bracket also carries a more geometric
interpretation. The bracket $[X,Y]$ measures an instantaneous path dependence
between the integral curves of $X$ and $Y$. To be more precise, for sufficiently small
$t \in (-\varepsilon, \varepsilon)$, consider the curve $c(t)$ that


• starts at a point $c(0) = p$;

• follows the integral curve of $X$ starting at $p$ for time $t$;

• starting from there, follows the integral curve of $Y$ for time $t$;

• then follows the integral curve of $X$ backwards, that is, for time $-t$;

• then follows the integral curve of $Y$ backwards, that is, for time $-t$.


See Figure 5.4. If $\varphi_t$ is the flow for $X$ and $\psi_t$ is the flow for $Y$, then this curve
$c : (-\varepsilon, \varepsilon) \to M$ is

$$c(t) = \psi_{-t}(\varphi_{-t}(\psi_t(\varphi_t(p)))).$$

Two properties are obvious. If $t$ approaches 0, then $c(t)$ approaches $p$. Also, if $x$
is a system of coordinates on a patch $U$ of $M$ and if $X = \partial_1$ and $Y = \partial_2$, then the
above steps for the description of $c(t)$ travel around a “square” with side $t$ based at
$p$, and thus $c(t)$ is constant. Other properties are not so obvious, and we refer the
reader to [52, Proposition 5.15, Theorem 5.16] for proofs.


Proposition 5.3.14. Defining the curve $c(t)$ as above,
1. $c'(0) = 0$;


2. if we define $c''(0)$ as the operator satisfying $c''(0)(f) = (f \circ c)''(0)$, then $c''(0)$
is a derivation and hence an element of $T_pM$;


3. $c''(0) = 2[X,Y]_p$.


Consequently, from an intuitive perspective, the Lie bracket $[X,Y]$ is a vector
field that at $p$ measures the second-order derivation of $c(t)$ at $p$. Since the first
derivative $c'(0)$ is 0, $c''(0) = 2[X,Y]_p$ gives the direction of motion of $c(t)$ out
of $p$ as a second-order approximation.
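
A short worked example shows Proposition 5.3.14 in action. The fields $X = \partial_x$ and $Y = x\,\partial_y$ on $\mathbb{R}^2$ are chosen here for illustration (they are not from the text) because both flows can be written down explicitly; $\varphi_t$ denotes the flow of $X$, $\psi_t$ the flow of $Y$, and $p = (x_0, y_0)$.

```latex
\begin{align*}
\varphi_t(x,y) &= (x + t,\ y), \qquad \psi_t(x,y) = (x,\ y + t x), \\
\psi_t(\varphi_t(p)) &= \big(x_0 + t,\ y_0 + t(x_0 + t)\big), \\
\varphi_{-t}\big(\psi_t(\varphi_t(p))\big) &= \big(x_0,\ y_0 + t x_0 + t^2\big), \\
c(t) = \psi_{-t}\big(\varphi_{-t}(\psi_t(\varphi_t(p)))\big) &= \big(x_0,\ y_0 + t^2\big), \\
c'(0) = 0, \qquad c''(0)(f) &= \frac{d^2}{dt^2} f(x_0, y_0 + t^2)\Big|_{t=0}
  = 2\,\frac{\partial f}{\partial y}(p) .
\end{align*}
```

By (5.15), $[X,Y] = \partial_x(x)\,\partial_y = \partial_y$, so $c''(0) = 2[X,Y]_p$, as the proposition states.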


PROBLEMS 


5.3.1. Let $M = \mathbb{R}^2$. Calculate the Lie bracket $[X,Y]$ for each of the following pairs of
vector fields:


) a 0, 0 
y 


(a) X =a Ya, +5, 


(b) X = sin(x 4 ye cosa and ¥ =cosx siny 5. 


5.3.2. Let $M = \mathbb{R}^3$. Calculate the Lie bracket $[X,Y]$ for each of the following pairs of
vector fields:
— 2 a i — i 3 a i a 
(a) X wa, +25 and Y (ety )a, +93, 
() a) () ce) a) 
(b) X uaa tg, FS and Y ta tg, as 
= 2 O —1 0 = 0 
(c) X = In(a* + ae + tan (ty) 5- and Y = an 


5.3.3. Referring to Problem 5.2.1, calculate the components of $[X, Z]$ over $U$.


5.3.4. Prove that the components in Equation (5.15), which come out of Proposition 5.3.1,
change contravariantly, as those of a vector, under a change of coordinates on $T_pM$.

5.3.5. Referring to Example 5.3.7, use Definition 5.3.5 to calculate $\mathcal{L}_X Y$ for an arbitrary
vector field $Y$ on $\mathbb{R}^2$. Calculate the Lie bracket $[X,Y]$ directly. [Hint: They should
be equal.]

5.3.6. Prove Proposition 5.3.8. 

5.3.7. Prove Proposition 5.3.4. 

5.3.8. Let $F : M \to N$ be a differentiable map. Let $X_1, X_2 \in \mathfrak{X}(M)$, and let $Y_1, Y_2 \in
\mathfrak{X}(N)$. Suppose that $X_i$ is $F$-related to $Y_i$ for $i = 1, 2$. Prove that $[X_1, X_2]$ is $F$-related to
$[Y_1, Y_2]$.

5.3.9. Prove Proposition 5.3.13. 

5.3.10. Let $X$ and $Y$ be differentiable vector fields on a $C^2$ manifold $M$. Let $\varphi$ be the
flow of $X$ on $M$. Prove that $\mathcal{L}_X Y = 0$ everywhere if and only if $Y$ is invariant
under the flow of $X$ (i.e., $Y_{\varphi_t(p)} = (\varphi_t)_* Y_p$).


5.4 Differential Forms 


We now consider a particular class of tensor fields called differential forms. As we 
will see, differential forms have many uses in geometry and in physics, in partic- 
ular for integration on manifolds. We introduced the linear algebra necessary for 
differential forms in Section 4.6. 

Though it would be possible to continue the discussion with manifolds and functions
of class $C^k$, we will restrict our attention to smooth manifolds for simplicity.


5.4.1 Definitions 


Definition 5.4.1. Let $M^n$ be a smooth manifold. A differential form $\omega$ of type
$r$ on $M$ (or more succinctly, $r$-form) is a smooth global section (tensor field) of
$\bigwedge^r(TM^*)$.


Intuitively, for each $p \in M$, we associate $\omega_p \in \bigwedge^r(T_pM^*)$ in such a way that
$\omega_p$ varies smoothly with $p$. The tensor $\omega_p$ is an alternating $r$-multilinear function
$T_pM^{\otimes r} \to \mathbb{R}$. Hence, a differential form is a particular type of covariant tensor field,
and 1-forms are simply covector fields. A differential form of type 0 is simply
a smooth real-valued function on $M$.
Let $U$ be a coordinate neighborhood of $M$ with coordinates $x = (x^1, x^2, \ldots, x^n)$.
Define $\mathcal{I}(r,n)$ as the set of all increasing sequences of length $r$ with values in
$\{1, 2, \ldots, n\}$. For example, $(2,3,7) \in \mathcal{I}(3,7)$ because there are three elements in the
sequence, they are listed in increasing order, and their values are in $\{1, 2, \ldots, 7\}$.
By Proposition 4.6.18, over the coordinate patch $U$, an $r$-form $\omega$ can be written in
a unique way as

$$\omega = \sum_{I \in \mathcal{I}(r,n)} a_I\, dx^I,$$

where each $a_I$ is a smooth function, and where we denote $dx^I = dx^{i_1} \wedge \cdots \wedge dx^{i_r}$
when $I$ is the $r$-tuple $I = (i_1, \ldots, i_r)$. Recall that the symbol $dx^i$ is defined in
Equations (5.6) and (5.7) and that this wedge product is defined as the alternation

$$dx^{i_1} \wedge \cdots \wedge dx^{i_r} = \mathcal{A}(dx^{i_1} \otimes dx^{i_2} \otimes \cdots \otimes dx^{i_r}). \tag{5.21}$$

Alternatively, in reference to a coordinate system, a differential form is a smooth
covariant tensor field of type $(0,r)$ whose component functions $\omega_{i_1 i_2 \cdots i_r}$ satisfy

$$\omega_{i_{\sigma(1)} i_{\sigma(2)} \cdots i_{\sigma(r)}} = \mathrm{sign}(\sigma)\, \omega_{i_1 i_2 \cdots i_r}$$

for any permutation $\sigma \in S_r$ of the indices.


Definition 5.4.2. If $U$ is an open subset of $M$, we denote by $\Omega^r(U)$ the set of all
differential forms of type $r$ on $U$.


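We remark that, similar to Problem 5.2.8, for each $r$, the set $\Omega^r(U)$ is an infinite-dimensional
vector space. In particular, if $\omega, \eta \in \Omega^r(U)$ and $\lambda \in \mathbb{R}$, then
$\omega + \eta \in \Omega^r(U)$ and $\lambda\omega \in \Omega^r(U)$, where by definition

$$(\omega + \eta)_p = \omega_p + \eta_p \quad \text{and} \quad (\lambda\omega)_p = \lambda\omega_p \quad \text{in } \textstyle\bigwedge^r T_pM^*.$$

Not only is each $\Omega^r(U)$ closed under scalar multiplication, but it is closed under
multiplication by a smooth function. More precisely, for all smooth functions
$f : U \to \mathbb{R}$, we have $f\omega \in \Omega^r(U)$, where $(f\omega)_p = f(p)\omega_p$ for all $p \in U$.

Finally, similar to the alternating products of a fixed vector space, for $\omega \in \Omega^r(U)$
and $\eta \in \Omega^s(U)$, we define the exterior product $\omega \wedge \eta \in \Omega^{r+s}(U)$ as the differential
form defined by $(\omega \wedge \eta)_p = \omega_p \wedge \eta_p$ for all $p \in U$.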


Example 5.4.3. Consider the sphere $S^2$, and let $U$ be the coordinate neighborhood 
with a system of coordinates $x$ defined by the parametrization 
$x^{-1}(u,v) = (\cos u \sin v, \sin u \sin v, \cos v)$ defined on $(0,2\pi) \times (0,\pi)$. Let 

$$\omega = (\sin^2 v) \, du + (\sin v \cos v) \, dv,$$
$$\eta = \cos u \sin v \, du + (\sin u \cos v - \sin v) \, dv$$

be two 1-forms on $S^2$. Remarking that $du \wedge du = dv \wedge dv = 0$ and $dv \wedge du = -du \wedge dv$, we calculate 

$$\omega \wedge \eta = \sin^2 v \, (\sin u \cos v - \cos u \cos v - \sin v) \, du \wedge dv.$$
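As a quick sanity check on Example 5.4.3, the following sketch evaluates the coefficient of $du \wedge dv$ in $\omega \wedge \eta$ at a few sample points and compares it with the closed form above. The function names (`omega`, `eta`, `wedge_coeff`, `claimed_coeff`) and the sample points are our own illustrative choices, not notation from the text.

```python
import math

def omega(u, v):
    """Components (a, b) of omega = a du + b dv from Example 5.4.3."""
    return (math.sin(v) ** 2, math.sin(v) * math.cos(v))

def eta(u, v):
    """Components (a, b) of eta = a du + b dv from Example 5.4.3."""
    return (math.cos(u) * math.sin(v),
            math.sin(u) * math.cos(v) - math.sin(v))

def wedge_coeff(alpha, beta, u, v):
    """Coefficient c in alpha ^ beta = c du ^ dv.

    Uses du ^ du = dv ^ dv = 0 and dv ^ du = -du ^ dv.
    """
    a1, a2 = alpha(u, v)
    b1, b2 = beta(u, v)
    return a1 * b2 - a2 * b1

def claimed_coeff(u, v):
    """Closed form for the coefficient stated in the example."""
    return math.sin(v) ** 2 * (
        math.sin(u) * math.cos(v) - math.cos(u) * math.cos(v) - math.sin(v))

if __name__ == "__main__":
    # Hypothetical sample points inside the chart (0, 2*pi) x (0, pi).
    for (u, v) in [(0.3, 0.5), (2.0, 1.1), (5.5, 2.9)]:
        print(u, v, wedge_coeff(omega, eta, u, v), claimed_coeff(u, v))
```

The same helper also illustrates the alternating property: `wedge_coeff(omega, omega, u, v)` vanishes identically.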
5.4.2. Exterior Differential 


Let $f$ be a smooth real-valued function on a smooth manifold $M$, and let $X \in \mathfrak{X}(M)$ 
be a vector field. Viewing $f : M \to \mathbb{R}$ as a differentiable map between manifolds, the 
differential $df$ is such that, at each point $p \in M$, it sends $X_p$ to a tangent 
vector $df_p(X_p)$ in $T_{f(p)}(\mathbb{R})$. However, the tangent space $T_{f(p)}(\mathbb{R})$ is identified with $\mathbb{R}$, so $df_p(X_p)$ 
is just a real number. Hence, $df_p \in T_pM^*$, and since all of the operations vary 
smoothly with $p$, we have $df \in \Omega^1(M)$. If $x = (x^1,\ldots,x^n)$ is a coordinate system on 
an open set $U \subset M$, then in coordinates we have 

$$df = \sum_{i=1}^n \frac{\partial f}{\partial x^i} \, dx^i. \tag{5.22}$$

Since $C^\infty(U,\mathbb{R}) = \Omega^0(U)$, the differential $d$ defines a linear transformation 
$d : \Omega^0(U) \to \Omega^1(U)$. We now generalize this remark by the following definition. 


Definition 5.4.4. Let $\omega = \sum_I a_I \, dx^I$ be a smooth differential $r$-form over $U$. The 
exterior differential of $\omega$ is the $(r+1)$-form written as $d\omega$ and defined by 

$$d\omega = \sum_{I \in \mathcal{I}(r,n)} (da_I) \wedge dx^I. \tag{5.23}$$


Example 5.4.5. Revisiting Example 5.4.3, we calculate $d\omega$ and $d\eta$. First, for $d\omega$ 
we have 

$$\begin{aligned}
d\omega &= (d(\sin^2 v)) \wedge du + (d(\sin v \cos v)) \wedge dv \\
&= (2 \sin v \cos v \, dv) \wedge du + ((\cos^2 v - \sin^2 v) \, dv) \wedge dv \\
&= (-2 \sin v \cos v) \, du \wedge dv.
\end{aligned}$$

For $d\eta$, we calculate 

$$\begin{aligned}
d\eta &= (d(\cos u \sin v)) \wedge du + (d(\sin u \cos v - \sin v)) \wedge dv \\
&= ((-\sin u \sin v) \, du + (\cos u \cos v) \, dv) \wedge du \\
&\quad + ((\cos u \cos v) \, du + (-\sin u \sin v - \cos v) \, dv) \wedge dv \\
&= (\cos u \cos v) \, dv \wedge du + (\cos u \cos v) \, du \wedge dv = 0.
\end{aligned}$$

The differential form $\eta$ has the unexpected property that $d\eta = 0$. We will say that 
$\eta$ is a closed 1-form (see Definition 5.4.7). 
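For a 1-form $a \, du + b \, dv$ in two variables, Definition 5.4.4 reduces to $d(a \, du + b \, dv) = (\partial b/\partial u - \partial a/\partial v) \, du \wedge dv$. The sketch below approximates this coefficient with central differences and compares it with the two results of Example 5.4.5. The helper `d_coeff` and the step size are assumptions made for illustration; finite differences only approximate the derivatives, so agreement holds up to a small tolerance.

```python
import math

def omega(u, v):
    """Components (a, b) of omega = a du + b dv."""
    return (math.sin(v) ** 2, math.sin(v) * math.cos(v))

def eta(u, v):
    """Components (a, b) of eta = a du + b dv."""
    return (math.cos(u) * math.sin(v),
            math.sin(u) * math.cos(v) - math.sin(v))

def d_coeff(form, u, v, h=1e-5):
    """Approximate coefficient of du ^ dv in d(a du + b dv) = (b_u - a_v) du ^ dv."""
    db_du = (form(u + h, v)[1] - form(u - h, v)[1]) / (2 * h)
    da_dv = (form(u, v + h)[0] - form(u, v - h)[0]) / (2 * h)
    return db_du - da_dv

if __name__ == "__main__":
    u, v = 1.2, 0.8  # hypothetical sample point in the chart
    print("d(omega):", d_coeff(omega, u, v), "expected", -2 * math.sin(v) * math.cos(v))
    print("d(eta):  ", d_coeff(eta, u, v), "expected 0")
```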


Proposition 5.4.6. Let $M$ be a smooth manifold, and let $U$ be an open subset of 
$M$. The exterior differential satisfies the following: 


1. For each $0 \le r \le n-1$, the operator $d : \Omega^r(U) \to \Omega^{r+1}(U)$ is a linear map. 
2. If $\omega \in \Omega^r(U)$ and $\eta \in \Omega^s(U)$, then 

$$d(\omega \wedge \eta) = d\omega \wedge \eta + (-1)^r \omega \wedge d\eta.$$

3. For all $\omega \in \Omega^r(U)$, we have $d(d\omega) = 0$. 


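Proof. For Part 1, set $\omega = \sum_I a_I \, dx^I$, where the summation is over all $I \in \mathcal{I}(r,n)$ 
and the $a_I$ are smooth real-valued functions on $U$. Then from Equation (5.23), each 
$da_I$ is a 1-form, so each term $(da_I) \wedge dx^I$ of the summation is an $(r+1)$-form. 


Now let $\eta = \sum_I b_I \, dx^I$ be another $r$-form, and let $\lambda, \mu \in \mathbb{R}$. Then 

$$\begin{aligned}
d(\lambda\omega + \mu\eta) &= \sum_I d(\lambda a_I + \mu b_I) \wedge dx^I \\
&= \sum_I \Big( \sum_{j=1}^n \Big( \lambda \frac{\partial a_I}{\partial x^j} + \mu \frac{\partial b_I}{\partial x^j} \Big) dx^j \Big) \wedge dx^I \\
&= \lambda \sum_I \Big( \sum_{j=1}^n \frac{\partial a_I}{\partial x^j} \, dx^j \Big) \wedge dx^I + \mu \sum_I \Big( \sum_{j=1}^n \frac{\partial b_I}{\partial x^j} \, dx^j \Big) \wedge dx^I \\
&= \lambda \, d\omega + \mu \, d\eta.
\end{aligned}$$

This proves linearity of $d$. 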


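For Part 2, again let $\omega$ be as above, and let $\eta \in \Omega^s(U)$ be expressed as 
$\eta = \sum_J b_J \, dx^J$, where the summation in $J$ runs over $\mathcal{I}(s,n)$. By the linearity of the 
wedge product, we can write 

$$\omega \wedge \eta = \sum_I \sum_J a_I b_J \, dx^I \wedge dx^J.$$

Note that for various combinations of $I$ and $J$, the wedge products $dx^I \wedge dx^J$ will 
vanish if $I$ and $J$ share any common indices. Then 

$$\begin{aligned}
d(\omega \wedge \eta) &= \sum_I \sum_J \Big( \sum_{k=1}^n \frac{\partial (a_I b_J)}{\partial x^k} \, dx^k \Big) \wedge dx^I \wedge dx^J \\
&= \sum_I \sum_J \Big( \sum_{k=1}^n \Big( \frac{\partial a_I}{\partial x^k} b_J + a_I \frac{\partial b_J}{\partial x^k} \Big) dx^k \Big) \wedge dx^I \wedge dx^J \\
&= \sum_I \sum_J \Big( \sum_{k=1}^n \frac{\partial a_I}{\partial x^k} b_J \, dx^k \Big) \wedge dx^I \wedge dx^J \\
&\quad + \sum_I \sum_J \Big( \sum_{k=1}^n a_I \frac{\partial b_J}{\partial x^k} \, dx^k \Big) \wedge dx^I \wedge dx^J.
\end{aligned}$$

But by the properties of wedge products, $dx^k \wedge dx^I \wedge dx^J = (-1)^r dx^I \wedge dx^k \wedge dx^J$ 
(see Proposition 4.6.21). Thus, 

$$\begin{aligned}
d(\omega \wedge \eta) &= \sum_I \sum_J \sum_{k=1}^n \frac{\partial a_I}{\partial x^k} b_J \, dx^k \wedge dx^I \wedge dx^J \\
&\quad + (-1)^r \sum_I \sum_J \sum_{k=1}^n a_I \frac{\partial b_J}{\partial x^k} \, dx^I \wedge dx^k \wedge dx^J \\
&= d\omega \wedge \eta + (-1)^r \omega \wedge d\eta.
\end{aligned}$$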

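To prove Part 3, we first show that $d(df) = 0$ for a smooth function $f$ on $M$. 
We have 

$$\begin{aligned}
d(df) &= d\Big( \sum_{i=1}^n \frac{\partial f}{\partial x^i} \, dx^i \Big) = \sum_{i=1}^n \sum_{j=1}^n \frac{\partial^2 f}{\partial x^j \partial x^i} \, dx^j \wedge dx^i \\
&= \sum_{I \in \mathcal{I}(2,n)} \Big( \frac{\partial^2 f}{\partial x^{i_1} \partial x^{i_2}} - \frac{\partial^2 f}{\partial x^{i_2} \partial x^{i_1}} \Big) dx^I,
\end{aligned}$$

where we assume $I = (i_1, i_2)$. However, since the function $f$ is smooth, by Clairaut's 
Theorem on mixed partials each component function is 0. Thus $d(df) = 0$. 
Now for any $r$-form $\omega = \sum_I a_I \, dx^I$ we have 

$$\begin{aligned}
d(d\omega) &= d\Big( \sum_I d(a_I) \wedge dx^I \Big) = \sum_I d\big( d(a_I) \wedge dx^I \big) && \text{(by linearity)} \\
&= \sum_I \big( d(da_I) \wedge dx^I - da_I \wedge d(dx^I) \big) && \text{(by part 2)} \\
&= 0,
\end{aligned}$$

where the last line follows because $d(da_I) = 0$ and $d(dx^I) = 0$ for all $I$. 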


It is illuminating to compare the exterior differential to differential operators on 
vector fields in $\mathbb{R}^n$. We emphasize three particular cases. 
First, let $f$ be a smooth real-valued function on $\mathbb{R}^n$. Then 

$$df = \sum_{i=1}^n \frac{\partial f}{\partial x^i} \, dx^i.$$

Thus, $df$ has exactly the same components as the gradient, defined in multivariable 
calculus as 

$$\operatorname{grad} f = \nabla f = (\partial_1 f, \partial_2 f, \ldots, \partial_n f).$$

Therefore, in our presentation, the gradient of a function $f$ is in fact a covector 
field, i.e., a vector field in $TM^* = (\mathbb{R}^n)^*$. 

In calculus courses, we do not distinguish between vectors and covectors, i.e., 
vectors in $\mathbb{R}^n$ or in $(\mathbb{R}^n)^*$, since these are isomorphic as vector spaces. However, as 
we saw in (4.2) and Proposition 4.1.6, vector fields and covector fields have different 
transformational properties under changes of coordinates. Example 4.5.8 showed 
that the gradient of a function transforms covariantly, but it is also instructive to 
see how this plays out in common formulas. For example, the chain rule for paths 
states that if $c(t)$ is a differentiable curve in $\mathbb{R}^n$ and $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable, 
then 

$$\frac{d}{dt} f(c(t)) = \nabla f(c(t)) \cdot c'(t).$$

However, from the perspective of multilinear algebra, we should understand the dot 
product in this context as the contraction map $V^* \otimes V \to \mathbb{R}$ defined by $\lambda \otimes v \mapsto \lambda(v)$. 
Since by definition $c'(t)$ is a tangent vector to $\mathbb{R}^n$ at $c(t)$, we should view the 
gradient $\nabla f$ as a covector in $(\mathbb{R}^n)^*$. 

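As a second illustration, consider $(n-1)$-forms over $\mathbb{R}^n$. For each $1 \le j \le n$, 
define the $(n-1)$-forms $\eta^j$ as 

$$\eta^j = (-1)^{j-1} \, dx^1 \wedge \cdots \wedge dx^{j-1} \wedge dx^{j+1} \wedge \cdots \wedge dx^n. \tag{5.24}$$

For each $p \in \mathbb{R}^n$, the set $\{\eta^j_p\}_{j=1}^n$ is a basis for $\bigwedge^{n-1}(T_p\mathbb{R}^n)^*$. So any $(n-1)$-form $\omega$ 
can be written as $\omega = \sum_{j=1}^n a_j \eta^j$ for functions $a_j : \mathbb{R}^n \to \mathbb{R}$. Note that having the 
$(-1)^{j-1}$ factor in the definition of $\eta^j$ leads to the identity 

$$dx^i \wedge \eta^j = \begin{cases} 0, & \text{if } i \ne j, \\ dx^1 \wedge \cdots \wedge dx^n, & \text{if } i = j. \end{cases} \tag{5.25}$$

Thus, for the differential of $\omega$, we have 

$$d\omega = \sum_{j=1}^n da_j \wedge \eta^j = \sum_{j=1}^n \sum_{i=1}^n \frac{\partial a_j}{\partial x^i} \, dx^i \wedge \eta^j = \Big( \sum_{j=1}^n \frac{\partial a_j}{\partial x^j} \Big) dx^1 \wedge \cdots \wedge dx^n.$$

Hence, for the case of $(n-1)$-forms, the exterior differential $d$ operates like the 
divergence operator $\operatorname{div} = \nabla \cdot$ on a vector field $(a_1, \ldots, a_n)$ in $\mathbb{R}^n$. 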
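In the case of $\mathbb{R}^3$, the exterior differential carries another point of significance. 
Let $\omega \in \Omega^1(\mathbb{R}^3)$, and write $\omega = \sum_{i=1}^3 a_i \, dx^i$. Then 

$$\begin{aligned}
d\omega &= \sum_{i=1}^3 \sum_{j=1}^3 \frac{\partial a_i}{\partial x^j} \, dx^j \wedge dx^i \\
&= \Big( \frac{\partial a_2}{\partial x^1} - \frac{\partial a_1}{\partial x^2} \Big) dx^1 \wedge dx^2 + \Big( \frac{\partial a_3}{\partial x^1} - \frac{\partial a_1}{\partial x^3} \Big) dx^1 \wedge dx^3 + \Big( \frac{\partial a_3}{\partial x^2} - \frac{\partial a_2}{\partial x^3} \Big) dx^2 \wedge dx^3 \\
&= \Big( \frac{\partial a_3}{\partial x^2} - \frac{\partial a_2}{\partial x^3} \Big) \eta^1 + \Big( \frac{\partial a_1}{\partial x^3} - \frac{\partial a_3}{\partial x^1} \Big) \eta^2 + \Big( \frac{\partial a_2}{\partial x^1} - \frac{\partial a_1}{\partial x^2} \Big) \eta^3,
\end{aligned}$$

whose components are precisely those of the curl of the vector field $(a_1, a_2, a_3)$. 

It is particularly interesting to note that the property $d(d\omega) = 0$ in Proposition 5.4.6 
summarizes simultaneously the following two standard theorems in multivariable 
calculus: 

$$\operatorname{curl} \operatorname{grad} f = 0 \quad \text{([55, Theorem 17.3])},$$
$$\operatorname{div} \operatorname{curl} F = 0 \quad \text{([55, Theorem 17.11])},$$

where $f : \mathbb{R}^3 \to \mathbb{R}$ is a function of class $C^2$ and $F : \mathbb{R}^3 \to \mathbb{R}^3$ is a vector field of 
class $C^2$. 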
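The two identities can be observed numerically. The following sketch builds grad, curl, and div from nested central differences and evaluates curl grad $f$ and div curl $F$ at a sample point. The step size, the helper names, and the sample fields `sample_f` and `sample_F` are hypothetical choices of ours; the output is zero only up to floating-point roundoff, so this is an illustration and not a proof.

```python
import math

H = 1e-3  # finite-difference step (an illustrative choice)

def partial(g, i, h=H):
    """Return the central-difference approximation of dg/dx^i as a function."""
    def dg(p):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        return (g(hi) - g(lo)) / (2 * h)
    return dg

def grad(f):
    """Components of df, read as the gradient of f."""
    return [partial(f, i) for i in range(3)]

def curl(F):
    """Components of d(F_1 dx^1 + F_2 dx^2 + F_3 dx^3) in the basis eta^1, eta^2, eta^3."""
    return [
        lambda p: partial(F[2], 1)(p) - partial(F[1], 2)(p),
        lambda p: partial(F[0], 2)(p) - partial(F[2], 0)(p),
        lambda p: partial(F[1], 0)(p) - partial(F[0], 1)(p),
    ]

def div(F):
    """Coefficient of dx^1 ^ dx^2 ^ dx^3 in d(F_1 eta^1 + F_2 eta^2 + F_3 eta^3)."""
    return lambda p: sum(partial(F[i], i)(p) for i in range(3))

# Hypothetical sample data.
def sample_f(p):
    return math.sin(p[0]) * p[1] ** 2 + math.exp(p[2]) * p[0]

sample_F = [
    lambda p: p[1] * p[2] ** 2,
    lambda p: math.sin(p[0] * p[2]),
    lambda p: p[0] * math.exp(p[1]),
]

if __name__ == "__main__":
    point = [0.4, -0.7, 1.1]
    print("curl grad f:", [c(point) for c in curl(grad(sample_f))])
    print("div curl F: ", div(curl(sample_F))(point))
```

Nested central differences commute exactly in exact arithmetic, which mirrors the role of Clairaut's Theorem in the proof of $d(d\omega) = 0$.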

We point out that the forms $\eta^j$ defined in (5.24) are instances of the Hodge star 
operator $\star$ which we discuss in Appendix C.3. The Hodge star operator exists in the 
general context of a vector space equipped with an inner product (a bilinear form 
that is symmetric and nondegenerate). In the above situation, we have $V = \mathbb{R}^n$ 
and the inner product $\langle \, , \rangle$ is the standard Euclidean dot product. Then according 
to Proposition C.3.3, we have 

$$\eta^j = \star \, dx^j.$$


5.4.3. Closed and Exact Forms 


Definition 5.4.7. Let $M$ be a smooth manifold. A differential form $\omega \in \Omega^r(M)$ 
is called closed if $d\omega = 0$ and is called exact if there exists $\eta \in \Omega^{r-1}(M)$ such that 
$\omega = d\eta$. 

Example 5.4.8. As an example, consider the explicit covector fields $\omega$ and $\eta$ on 
$S^2$ described in Examples 5.4.3 and 5.4.5. In Example 5.4.5 we showed that $d\eta = 0$, 
meaning that $\eta$ is closed. If 

$$\eta = \cos u \sin v \, du + (\sin u \cos v - \sin v) \, dv$$

is exact, then there exists a 0-form, i.e., differentiable function, $f : S^2 \to \mathbb{R}$ such 
that $\eta = df$. Thus 

$$\frac{\partial f}{\partial u} = \cos u \sin v \quad\text{and}\quad \frac{\partial f}{\partial v} = \sin u \cos v - \sin v.$$

By integrating with respect to $u$, we must have $f(u,v) = \sin u \sin v + g(v)$. Then 
differentiating with respect to $v$ gives $\partial f/\partial v = \sin u \cos v + g'(v) = \sin u \cos v - \sin v$. 
Thus, $g'(v) = -\sin v$, so $g(v) = \cos v + C$ for some constant $C$, and $f(u,v) = \sin u \sin v + \cos v + C$. 
A priori, this is only defined for $(u,v) \in (0,2\pi) \times (0,\pi)$. However, $f(u,0) = 1 + C$ and 
$f(u,\pi) = -1 + C$, regardless of $u \in \mathbb{R}$, and for $v \in (0,\pi)$, we also have $f(u + 2\pi, v) = 
f(u,v)$. Hence, $f$ extends continuously to a well-defined function on all of $S^2$. (We 
also note that with respect to the typical embedding of $S^2$ in $\mathbb{R}^3$, the function $f$ is 
equal to $y + z + C$ restricted to $S^2$.) 
Our calculations show that $\eta = df$, so $\eta$ is exact. 
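The potential found in Example 5.4.8 can be checked numerically. The sketch below differentiates $f(u,v) = \sin u \sin v + \cos v + C$ by central differences, compares the result with the components of $\eta$, and confirms that $f$ agrees with $y + z + C$ on the embedded sphere. The helper names and the step size are illustrative assumptions.

```python
import math

def eta(u, v):
    """Components (a, b) of eta = a du + b dv."""
    return (math.cos(u) * math.sin(v),
            math.sin(u) * math.cos(v) - math.sin(v))

def f(u, v, C=0.0):
    """Candidate potential from Example 5.4.8."""
    return math.sin(u) * math.sin(v) + math.cos(v) + C

def df(u, v, h=1e-6):
    """Central-difference approximation of (f_u, f_v)."""
    f_u = (f(u + h, v) - f(u - h, v)) / (2 * h)
    f_v = (f(u, v + h) - f(u, v - h)) / (2 * h)
    return (f_u, f_v)

def embed(u, v):
    """Parametrization x^{-1}(u, v) of the sphere in R^3."""
    return (math.cos(u) * math.sin(v),
            math.sin(u) * math.sin(v),
            math.cos(v))

if __name__ == "__main__":
    u, v = 2.3, 1.0  # hypothetical sample point
    print("df :", df(u, v))
    print("eta:", eta(u, v))
    x, y, z = embed(u, v)
    print("f - (y + z):", f(u, v) - (y + z))
```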


This shows that not just any pair of smooth functions $a_1(u,v)$ and $a_2(u,v)$ allows 
the 1-form 

$$\omega = a_1(u,v) \, du + a_2(u,v) \, dv$$

to extend over $S^2$ to create a smooth 1-form on $S^2$. For example, not even $u \, du$, 
defined on the same coordinate chart described in the above example, extends 
continuously to a 1-form on all of $S^2$. This restriction shows that $\Omega^1(S^2)$ is affected by 
the global geometry of $S^2$. The principle behind this example is true in general: the 
vector spaces $\Omega^r(M)$, though infinite-dimensional, depend on the global structure 
of $M$. 

The identity $d(d\omega) = 0$ for any differential form means that every exact form is 
closed. The converse is not true in general, and it is precisely this fact that leads to 
profound results in topology. In the language of homology, the sequence of vector 
spaces and linear maps 

$$\Omega^0(M) \xrightarrow{\;d\;} \Omega^1(M) \xrightarrow{\;d\;} \Omega^2(M) \xrightarrow{\;d\;} \cdots \xrightarrow{\;d\;} \Omega^n(M)$$

satisfying the identity $d \circ d = 0$ is called a complex. To distinguish between types, 
we often write $d^r$ for the differential $d : \Omega^r(M) \to \Omega^{r+1}(M)$. The fact that every 
exact form is closed can be restated once more by saying that $\operatorname{Im} d^{r-1}$ is a vector 
subspace of $\ker d^r$. The quotient vector space 

$$\ker d^r / \operatorname{Im} d^{r-1} = \ker\big(d : \Omega^r(M) \to \Omega^{r+1}(M)\big) / \operatorname{Im}\big(d : \Omega^{r-1}(M) \to \Omega^r(M)\big)$$

is called the $r$th de Rham cohomology group of $M$, denoted $H^r_{dR}(M)$. The de Rham 
cohomology groups are in fact global properties of the manifold $M$ and are related 
to profound topological invariants of $M$. This topic exceeds the scope of this book, 
but we wish to point out two ways in which one can glimpse why the groups $H^r_{dR}(M)$ 
are global properties of $M$. 

In Example 5.4.8 we observed that defining a form on all of $S^2$ carries some 
restrictions. Hence, the space of forms carries information about the global 
structure of $M$. Similarly, Problem 5.4.13 gives an example of a 1-form on 
$\mathbb{R}^2 - \{(0,0)\}$ that is closed but not exact. 

As a second example, we determine $H^0_{dR}(M)$ for any manifold. Of course, $d^{-1}$ 
does not exist explicitly so we set, by convention, $\Omega^{-1}(M) = 0$, i.e., the 
zero-dimensional vector space. Then $\operatorname{Im} d^{-1} = \{0\}$ is the trivial subspace in $\Omega^0(M)$. 
Furthermore, since $\Omega^0(M)$ is the space of all smooth real-valued functions on $M$, 
the 0th cohomology group is 

$$H^0_{dR}(M) = \ker\big(d : C^\infty(M) \to \Omega^1(M)\big) / \{0\} = \ker\big(d : C^\infty(M) \to \Omega^1(M)\big),$$

namely, the subspace of all smooth functions on $M$ whose differentials are 0. In 
other words, $H^0_{dR}(M)$ is the space of all functions that are constant on each 
connected component of $M$. Thus, $H^0_{dR}(M) = \mathbb{R}^\ell$, where $\ell$ is the number of connected 
components of $M$, a global property. 


5.4.4. Algebra of Differential Forms 


We conclude this section with a brief comment about the algebra of differential 
forms. Not unlike the tensor algebra or the alternating algebra over a vector space 
$V$ defined in Section 4.7, we define the algebra of smooth differential forms over a 
smooth $m$-dimensional manifold $M$ as 

$$\Omega^\bullet(M) = \bigoplus_{k=0}^m \Omega^k(M) = C^\infty(M,\mathbb{R}) \oplus \Omega^1(M) \oplus \cdots \oplus \Omega^m(M),$$

equipped with the exterior product $\wedge$ as the bilinear product. 


PROBLEMS 


5.4.1. Let $M$ be a smooth manifold. Let $\omega \in \Omega^r(M)$ be a nonzero $r$-form. Characterize 
the forms $\eta \in \Omega^s(M)$ such that $\omega \wedge \eta = 0$. 


5.4.2. Let $M = \mathbb{R}^3$. Find the exterior differential of the following: 
(a) $x \, dy \wedge dz + y \, dz \wedge dx + z \, dx \wedge dy$. 
(b) $x y^2 z^3 \, dx + y \sin(xz) \, dz$. 
(c) $\dfrac{dx \wedge dy + x \, dy \wedge dz}{x^2 + y^2 + z^2 + 1}$. 


5.4.3. Let $M = \mathbb{R}^n$. Let $\omega = x^1 \, dx^1 + \cdots + x^n \, dx^n$ and 
$\eta = x^2 \, dx^1 + \cdots + x^n \, dx^{n-1} + x^1 \, dx^n$. 
(a) Calculate $d\omega$ and $d\eta$. 
(b) Calculate $\omega \wedge \eta$ and $d(\omega \wedge \eta)$. 
(c) Calculate the exterior differential of $x^1 \eta^1 + x^2 \eta^2 + \cdots + x^n \eta^n$, where the 
forms $\eta^j$ are defined as in Equation (5.24). 


5.4.4. Let $M = S^1 \times S^1$ be the torus in $\mathbb{R}^3$ that has a coordinate neighborhood $(U, x)$ 
that can be parametrized by 

$$x^{-1}(u,v) = \big( (3 + \cos u) \cos v, \, (3 + \cos u) \sin v, \, \sin u \big) \quad \text{for } (u,v) \in (0, 2\pi)^2.$$

Consider the two differential forms $\omega$ and $\eta$, given over $U$ by $\omega = \cos(u + v) \, du + 
2 \sin^2 u \, dv$ and $\eta = 3 \sin^2 v \, du - 4 \, dv$. 
(a) Show why $\omega$ and $\eta$ extend to differential forms over the whole torus. 
(b) Calculate $\omega \wedge \omega$ and $\omega \wedge \eta$. 
(c) Calculate $d\omega$ and $d\eta$. 


5.4.5. Consider the manifold $\mathbb{RP}^3$ with the standard atlas described in Example 3.1.6. 
Consider also the 1-form that is described in coordinates over $U_0$ as 
$\omega = x^1 \, dx^1 + x^2 (x^3)^2 \, dx^2 + x^1 x^2 \, dx^3$. 
(a) Write down a coordinate expression for $\omega$ in $U_1$, $U_2$, and $U_3$. 
(b) Calculate $d\omega$ and $\omega \wedge \omega$ in coordinates over $U_0$. 
(c) Calculate $d\omega$ in coordinates over $U_1$ and show explicitly that the coordinates 
change as expected over $U_0 \cap U_1$. 


5.4.6. Set $\omega = x^1 x^2 \, dx^2 + (x^3 + 3 x^4 x^5) \, dx^3 + ((x^2)^2 + (x^3)^2) \, dx^5$ as a 1-form over $\mathbb{R}^5$. 
Calculate $d\omega$, $\omega \wedge d\omega$, $\omega \wedge \omega$, and $d\omega \wedge d\omega \wedge \omega$. 


5.4.7. Consider the spacetime variables $(x^0, x^1, x^2, x^3) = (ct, x, y, z)$ in $\mathbb{R}^{1+3}$ and consider 
the two 2-forms $\alpha$ and $\beta$ defined by 

$$\alpha = -\sum_{i=1}^3 E_i \, dx^0 \wedge dx^i + \sum_{j=1}^3 B_j \eta^j \quad\text{and}\quad \beta = \sum_{i=1}^3 B_i \, dx^0 \wedge dx^i + \sum_{j=1}^3 E_j \eta^j,$$

where the forms $\eta^j$ are the 2-forms defined in Equation (5.24) over the space 
variables, i.e., $\eta^1 = dx^2 \wedge dx^3$, $\eta^2 = -dx^1 \wedge dx^3$, and $\eta^3 = dx^1 \wedge dx^2$. 
(a) Writing $\vec{E} = (E_1, E_2, E_3)$ and $\vec{B} = (B_1, B_2, B_3)$ as time-dependent vector 
fields in $\mathbb{R}^3$, show that the source-free Maxwell's equations 

$$\nabla \times \vec{E} = -\frac{1}{c} \frac{\partial \vec{B}}{\partial t}, \qquad \nabla \cdot \vec{E} = 0,$$
$$\nabla \times \vec{B} = \frac{1}{c} \frac{\partial \vec{E}}{\partial t}, \qquad \nabla \cdot \vec{B} = 0,$$

can be expressed in the form $d\alpha = 0$ and $d\beta = 0$. 
(b) If we write the 1-form $\lambda = -\phi \, dx^0 + A_1 \, dx^1 + A_2 \, dx^2 + A_3 \, dx^3$, show that 
$d\lambda = \alpha$ if and only if 

$$\vec{B} = \nabla \times \vec{A} \quad\text{and}\quad \vec{E} = -\nabla \phi - \frac{1}{c} \frac{\partial \vec{A}}{\partial t}.$$


5.4.8. In the theory of differential equations, if $A(x,y)$ and $B(x,y)$ are functions of $x$ 
and $y$, an integrating factor for an expression of the form $A \frac{dy}{dx} + B$ is a function 
$I(x,y)$ such that 

$$I(x,y) \Big( A(x,y) \frac{dy}{dx} + B(x,y) \Big) = \frac{d}{dx} F(x,y)$$

for some function $F(x,y)$. If $M$ is a smooth manifold and $\omega \in \Omega^1(M)$, we call an 
integrating factor of $\omega$ a smooth function $f$ that is nowhere 0 on $M$ and such that 
$f\omega$ is exact. Prove that if such a function $f$ exists, then $\omega \wedge d\omega = 0$. 
5.4.9. Let $\omega = (1 + xy^2) e^{xy^2} \, dx + 2x^2 y e^{xy^2} \, dy$ be a 1-form on $\mathbb{R}^2$. Show that $d\omega = 0$. 
Then find a function $f : \mathbb{R}^2 \to \mathbb{R}$ such that $\omega = df$. 


5.4.10. Let $\omega = yz \, dx \wedge dz + (-y + xz) \, dy \wedge dz$ be a 2-form on $\mathbb{R}^3$. Show that $d\omega = 0$. 
Then find a 1-form $\lambda$ such that $\omega = d\lambda$. 


5.4.11. Suppose that $\omega \in \Omega^1(M)$ for some smooth manifold $M$. Show that if, over each 
coordinate chart of $M$, writing $\omega$ in components as $\omega = \omega_i \, dx^i$, the 
component functions have the property that $\partial_i \omega_j = \partial_j \omega_i$, then $\omega$ is a closed form. 


5.4.12. Let $\omega$ and $\eta$ be forms on a smooth manifold $M$. 
(a) Show that if $\omega$ and $\eta$ are closed, then so is $\omega \wedge \eta$. 
(b) Show that if $\omega$ and $\eta$ are exact, then so is $\omega \wedge \eta$. 


5.4.13. Consider the manifold $M = \mathbb{R}^2 - \{(0,0)\}$ with the structure inherited from $\mathbb{R}^2$ 
and let 

$$\omega = -\frac{y}{x^2 + y^2} \, dx + \frac{x}{x^2 + y^2} \, dy.$$

Prove that $\omega$ is closed but not exact. [Note: In this case, there does exist a 
differentiable function $f$ such that $df = \omega$ on $\{(x,y) \in \mathbb{R}^2 \mid x > 0\}$ but not on all 
of $M$. This particular form $\omega$ shows that $\dim H^1_{dR}(M) \ge 1$.] 


5.4.14. Let $M$ be a manifold of dimension $m \ge 4$. Let $\omega$ be a 2-form on $M$, and let $\{\alpha, \beta\}$ 
be a set of linearly independent 1-forms. Show that 

$$\omega \wedge \alpha \wedge \beta = 0$$

if and only if there exist 1-forms $\lambda$ and $\eta$ such that 

$$\omega = \lambda \wedge \alpha + \eta \wedge \beta.$$


5.4.15. Consider the manifold $GL_n(\mathbb{R})$ of invertible matrices, and consider the function 
$\det : GL_n(\mathbb{R}) \to \mathbb{R}$ as a function between manifolds. 
(a) Prove that for all $X \in GL_n(\mathbb{R})$, the tangent space is $T_X GL_n(\mathbb{R}) \cong \mathbb{R}^{n \times n}$, 
the space of $n \times n$ matrices. 
(b) Writing the entries of a matrix $X \in GL_n(\mathbb{R})$ as $X = (x^i_j)$, prove that 

$$\frac{\partial \det}{\partial x^i_j}(X) = (\det X)(X^{-1})^j_i.$$

(c) Prove that the differential of the determinant map can be written as 

$$d(\det)_X(A) = (\det X) \operatorname{Tr}(X^{-1} A),$$

where $\operatorname{Tr} M = \sum_{i=1}^n m^i_i$ is the trace of the matrix. 


5.4.16. This exercise presents the interior product of $k$-forms on a smooth manifold $M$. 
The interior product of a $k$-form with $k \ge 1$ is defined as the contraction of 
the form with a vector field. More precisely, if $X$ is a vector field on $M$ we define 
$i_X : \Omega^k(M) \to \Omega^{k-1}(M)$ such that for all $p \in M$ 

$$(i_X \omega)_p(v_1, v_2, \ldots, v_{k-1}) = \omega_p(X_p, v_1, \ldots, v_{k-1})$$

for all $v_1, \ldots, v_{k-1} \in T_pM$. Prove the following properties of the interior product. 
Let $X$ be a smooth vector field over $M$. Suppose that over a coordinate patch 
$(U, x)$, the vector field is written in components as $X^i \partial_i$. 
(a) Prove that $i_X(dx^1 \wedge dx^2 \wedge dx^3) = X^3 \, dx^1 \wedge dx^2 - X^2 \, dx^1 \wedge dx^3 + X^1 \, dx^2 \wedge dx^3$. 
[Hint: Refer to (5.21).] 
(b) Suppose $I = (i_1, i_2, \ldots, i_r)$ with $i_1 < i_2 < \cdots < i_r$. Prove that 

$$i_X(dx^I) = \sum_{j=1}^r (-1)^{j-1} X^{i_j} \, dx^{i_1} \wedge \cdots \wedge \widehat{dx^{i_j}} \wedge \cdots \wedge dx^{i_r},$$

where the $\widehat{dx^{i_j}}$ means to remove that term. 
(c) If $\alpha$ is an $r$-form and $\beta$ an $s$-form, then $i_X(\alpha \wedge \beta) = (i_X \alpha) \wedge \beta + (-1)^r \alpha \wedge (i_X \beta)$. 
[Hint: Using coordinates, first prove this result on $\alpha = dx^I$ and $\beta = dx^J$ 
with $I \in \mathcal{I}(r,n)$ and $J \in \mathcal{I}(s,n)$.] 
(d) If $Y$ is another vector field, then $i_X i_Y \omega = -i_Y i_X \omega$. 


5.5 Pull-Backs of Covariant Tensor Fields 


In this section we define the notion of a pull-back of a covariant tensor field by 
a smooth function between manifolds. Though the construction of pull-backs is 
interesting in its own right, in subsequent sections we will see a few applications of 
the pull-back, including how to integrate differential forms over a manifold. 


Definition 5.5.1. Let $f : M^m \to N^n$ be a smooth map between two smooth 
manifolds, and let $\alpha \in \Gamma(TN^{*\otimes s})$ be a covariant tensor field on $N$. Define the 
pull-back of $\alpha$ by $f$, written $f^*\alpha$, as the covariant tensor field on $M$ whose value at each 
$p \in M$ is the multilinear function on $T_pM$ defined by 

$$(f^*\alpha)_p(v_1, v_2, \ldots, v_s) = \alpha_{f(p)}\big(df_p(v_1), df_p(v_2), \ldots, df_p(v_s)\big), \tag{5.26}$$

where the $v_i$ are tangent vectors in $T_pM$. 


According to this definition, $(f^*\alpha)_p \in T_pM^{*\otimes s}$, so $f^*\alpha$ is a global section from 
$M$ into the vector bundle $TM^{*\otimes s}$. Furthermore, it is not hard to see that if $\omega$ is a 
differential form in $\Omega^s(N)$, then $(f^*\omega)_p$ is also an alternating multilinear function 
on $T_pM$, so $f^*\omega$ is a differential form in $\Omega^s(M)$. 

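The above definition is coordinate-free. We now work to express the pull-back 
of a covariant tensor field in terms of coordinates. Let $x$ be a local coordinate 
system on $M$ and let $y$ be a coordinate system on $N$. Suppose that over a coordinate 
neighborhood $(V, y)$ of $N$, the covariant tensor field $\alpha$ is written as 

$$\alpha = \alpha_{i_1 i_2 \cdots i_s} \, dy^{i_1} \otimes dy^{i_2} \otimes \cdots \otimes dy^{i_s},$$

where $\alpha_{i_1 i_2 \cdots i_s}$ is a smooth function on $V$ for each $s$-tuple $(i_1, i_2, \ldots, i_s)$. Then 
locally, for every $v \in T_pM$, expressed in terms of coordinates we have 
$(df_p(v))^j = \sum_{i=1}^m \frac{\partial f^j}{\partial x^i} v^i$ for $j = 1, \ldots, n$, where the functions $f^j$ are the components 
$f^j = y^j \circ f : M \to \mathbb{R}$. Then 

$$\begin{aligned}
(f^*(dy^{i_1} \otimes dy^{i_2} \otimes \cdots \otimes dy^{i_s}))_p(v_1, \ldots, v_s) 
&= (dy^{i_1} \otimes dy^{i_2} \otimes \cdots \otimes dy^{i_s})(df_p(v_1), \ldots, df_p(v_s)) \\
&= dy^{i_1}(df_p(v_1)) \, dy^{i_2}(df_p(v_2)) \cdots dy^{i_s}(df_p(v_s)) \\
&= df^{i_1}(v_1) \, df^{i_2}(v_2) \cdots df^{i_s}(v_s) \\
&= (df^{i_1} \otimes df^{i_2} \otimes \cdots \otimes df^{i_s})_p(v_1, v_2, \ldots, v_s).
\end{aligned}$$

We conclude that in coordinates, as a covariant tensor field over $M$, 

$$f^*\alpha = (\alpha_{i_1 i_2 \cdots i_s} \circ f) \, df^{i_1} \otimes df^{i_2} \otimes \cdots \otimes df^{i_s} \tag{5.27}$$
$$\phantom{f^*\alpha} = (\alpha_{i_1 i_2 \cdots i_s} \circ f) \, \frac{\partial f^{i_1}}{\partial x^{j_1}} \frac{\partial f^{i_2}}{\partial x^{j_2}} \cdots \frac{\partial f^{i_s}}{\partial x^{j_s}} \, dx^{j_1} \otimes dx^{j_2} \otimes \cdots \otimes dx^{j_s}. \tag{5.28}$$

If $\omega$ happens to be a differential form of type $r$, then in coordinates 

$$f^*\omega = f^*\Big( \sum_{I \in \mathcal{I}(r,n)} a_I \, dy^I \Big) = \sum_{I \in \mathcal{I}(r,n)} (a_I \circ f) \, df^{i_1} \wedge \cdots \wedge df^{i_r}, \tag{5.29}$$

where we are writing $I = (i_1, i_2, \ldots, i_r)$. 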

Example 5.5.2. Let $M = \mathbb{R}$ and let $N^n$ be a differentiable manifold. Consider an 
immersion $\gamma : \mathbb{R} \to N$, which we can understand as a regular curve on $N$. Let $\omega$ 
be a 1-form on $N$ such that over a coordinate neighborhood of $N$ with coordinates 
$y = (y^1, y^2, \ldots, y^n)$, we write 

$$\omega = \omega_1 \, dy^1 + \omega_2 \, dy^2 + \cdots + \omega_n \, dy^n.$$

Using $t$ as the coordinate of $\mathbb{R}$, we write in coordinates 

$$(\gamma^*\omega)_t = \omega_1(\gamma(t)) \frac{d\gamma^1}{dt} \, dt + \omega_2(\gamma(t)) \frac{d\gamma^2}{dt} \, dt + \cdots + \omega_n(\gamma(t)) \frac{d\gamma^n}{dt} \, dt.$$

As we will see in a later section, this pull-back is related to calculating line integrals. 


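Example 5.5.3. Consider the unit sphere $S^2$ and let $(\theta, \varphi)$ be the usual 
longitude-latitude coordinate patch. The typical embedding $f : S^2 \to \mathbb{R}^3$ corresponds to 
the functions 

$$f(\theta, \varphi) = (\cos\theta \sin\varphi, \, \sin\theta \sin\varphi, \, \cos\varphi).$$

The dot product on $\mathbb{R}^3$ corresponds to a covariant tensor field of type 2, expressed 
in usual coordinates $(x, y, z)$ by 

$$\omega = dx \otimes dx + dy \otimes dy + dz \otimes dz.$$

With respect to the given coordinate systems, the pull-back of $\omega$ is 

$$\begin{aligned}
f^*\omega &= (-\sin\theta \sin\varphi \, d\theta + \cos\theta \cos\varphi \, d\varphi) \otimes (-\sin\theta \sin\varphi \, d\theta + \cos\theta \cos\varphi \, d\varphi) \\
&\quad + (\cos\theta \sin\varphi \, d\theta + \sin\theta \cos\varphi \, d\varphi) \otimes (\cos\theta \sin\varphi \, d\theta + \sin\theta \cos\varphi \, d\varphi) \\
&\quad + (-\sin\varphi \, d\varphi) \otimes (-\sin\varphi \, d\varphi) \\
&= (\sin^2\theta \sin^2\varphi + \cos^2\theta \sin^2\varphi) \, d\theta \otimes d\theta \\
&\quad + (-\sin\theta \cos\theta \sin\varphi \cos\varphi + \sin\theta \cos\theta \sin\varphi \cos\varphi) \, d\theta \otimes d\varphi \\
&\quad + (-\sin\theta \cos\theta \sin\varphi \cos\varphi + \sin\theta \cos\theta \sin\varphi \cos\varphi) \, d\varphi \otimes d\theta \\
&\quad + (\cos^2\theta \cos^2\varphi + \sin^2\theta \cos^2\varphi + \sin^2\varphi) \, d\varphi \otimes d\varphi \\
&= \sin^2\varphi \, d\theta \otimes d\theta + d\varphi \otimes d\varphi.
\end{aligned}$$

As we will see, this is the standard metric tensor on the sphere with the 
longitude-latitude coordinate system. 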

Example 5.5.4. As another example, whose details we leave as an exercise 
(Problem 5.5.5), we deduce the following fundamental formula. Let $M$ and $N$ be 
smooth manifolds of the same dimension $n$, and $f$ a smooth map between them. In 
reference to a coordinate chart $(U, x)$ on $M$ and a chart $(V, y)$ on $N$, for all $p \in U$, 

$$f^*(dy^1 \wedge dy^2 \wedge \cdots \wedge dy^n)_p = (\det df_p) \, dx^1 \wedge dx^2 \wedge \cdots \wedge dx^n. \tag{5.30}$$

We notice that if $M = N = \mathbb{R}^n$, then $\det df_p$ is the Jacobian of the function $f$ at 
the point $p$. 

The pull-back of covariant tensor fields satisfies a few properties. The proofs are 
straightforward so we leave them as exercises. 


Proposition 5.5.5. Let $f : M^m \to N^n$ be a smooth map between smooth manifolds. 
Let $\alpha$ and $\beta$ be covariant tensor fields on $N$. 

1. The pull-back $f^* : \Gamma(TN^{*\otimes s}) \to \Gamma(TM^{*\otimes s})$ is a linear function. 

2. If $a : N \to \mathbb{R}$ is a smooth function, then $f^*(a\alpha) = (a \circ f) f^*\alpha$. 

3. $f^*(\alpha \otimes \beta) = f^*(\alpha) \otimes f^*(\beta)$. 

4. $\mathrm{id}_N^*(\alpha) = \alpha$. 


Proof. (Left as an exercise for the reader. See Problem 5.5.2.) 


The pull-back of $r$-forms satisfies a few more properties. 
Proposition 5.5.6. Let $f : M^m \to N^n$ be a smooth map between smooth manifolds. 
The following hold for all $r \le \min(m,n)$: 
1. Considering $\Omega^r(N)$ as a subspace of $\Gamma(TN^{*\otimes r})$, then $f^*(\Omega^r(N)) \subset \Omega^r(M)$. 
2. For all $\omega \in \Omega^r(N)$ and $\eta \in \Omega^s(N)$, $f^*(\omega \wedge \eta) = (f^*\omega) \wedge (f^*\eta)$. 
3. For all $\omega \in \Omega^r(N)$ with $r < \min(m,n)$, $f^*(d\omega) = d(f^*\omega)$. 
Figure 5.5: The curve on $S^2$ in Example 5.5.8. 


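Proof. Part 1 follows immediately from the functional definition in Equation (5.26). 
Part 2 is an easy application of Equation (5.29). Finally, for part 3, note that 
$d(df^{i_1} \wedge \cdots \wedge df^{i_r}) = 0$ by a repeated use of Proposition 5.4.6(2) and the fact that 
$d(df^i) = 0$. Then if $\omega = \sum_I a_I \, dy^I$, Equation (5.29) gives 

$$\begin{aligned}
d(f^*\omega) &= \sum_{I \in \mathcal{I}(r,n)} d\big( (a_I \circ f) \, df^{i_1} \wedge \cdots \wedge df^{i_r} \big) \\
&= \sum_{I \in \mathcal{I}(r,n)} \Big( d(a_I \circ f) \wedge df^{i_1} \wedge \cdots \wedge df^{i_r} + (a_I \circ f) \, d(df^{i_1} \wedge \cdots \wedge df^{i_r}) \Big) \\
&= \sum_{I \in \mathcal{I}(r,n)} d(a_I \circ f) \wedge df^{i_1} \wedge \cdots \wedge df^{i_r} = \sum_{I \in \mathcal{I}(r,n)} d(a_I \circ f) \wedge f^*(dy^I) \\
&= \sum_{I \in \mathcal{I}(r,n)} f^*(da_I) \wedge f^*(dy^I) = f^*(d\omega),
\end{aligned}$$

where the last line uses $d(a_I \circ f) = f^*(da_I)$, which is the chain rule, together with part 2. 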


Proposition 5.5.7. Let $f : M \to N$ and $g : U \to M$ be smooth functions between 
smooth manifolds. Then $(f \circ g)^* = g^* \circ f^*$. 


Proof. (Left as an exercise for the reader.) 


Example 5.5.8. As a slightly more involved example, we revisit the 1-form of Example 5.4.3, 
now with $M = \mathbb{R}$ and $N = S^2$. Let $V$ be the coordinate neighborhood on $S^2$ with a 
system of coordinates $y$ defined by the parametrization 
$y^{-1}(u,v) = (\cos u \sin v, \sin u \sin v, \cos v)$ 
defined on $(0,2\pi) \times (0,\pi)$. We consider the 1-form 

$$\omega = (\sin^2 v) \, du + (\sin v \cos v) \, dv.$$

Consider the function $f : \mathbb{R} \to S^2$ defined in coordinates by $(u,v) = f(t) = (3t, 1 + 
\tfrac{1}{2} \sin t)$. The image of this immersion is depicted in Figure 5.5. Defined in this way, 
we see that $f^1(t) = 3t$ and $f^2(t) = 1 + \tfrac{1}{2} \sin(t)$. Then 

$$\begin{aligned}
(f^*\omega)_t &= \sin^2\big(1 + \tfrac{1}{2} \sin(t)\big) \, 3 \, dt \\
&\quad + \sin\big(1 + \tfrac{1}{2} \sin(t)\big) \cos\big(1 + \tfrac{1}{2} \sin(t)\big) \big(\tfrac{1}{2} \cos t\big) \, dt.
\end{aligned}$$

PROBLEMS 


5.5.1. Prove that (5.29) follows from (5.27). 


5.5.2. Prove Proposition 5.5.5. 


5.5.3. Let $f : M \to N$ be a differentiable map of manifolds. Prove that if $\alpha$ is a global 
section of $\operatorname{Sym}^k(TN^*)$, then $f^*\alpha$ is a global section of $\operatorname{Sym}^k(TM^*)$. 


5.5.4. Prove Proposition 5.5.7. 


5.5.5. Prove the formula mentioned in Example 5.5.4. 


5.5.6. Prove the product rule for the Lie derivative of the product between a function 
and a covariant tensor field: Let $M$ be a smooth manifold, let $f \in C^1(M,\mathbb{R})$, let $X$ 
be a differentiable vector field on $M$, and let $\alpha$ be a differentiable covariant tensor 
field on $M$. Prove that $\mathcal{L}_X(f\alpha) = (\mathcal{L}_X f)\alpha + f(\mathcal{L}_X \alpha)$. [Hint: Use a 
coordinate-dependent approach.] 


5.5.7. This exercise generalizes Example 5.5.3. Let $S$ be a parametrized surface in $\mathbb{R}^3$, 
which we can think of as an immersion of a 2-manifold in $\mathbb{R}^3$. Let $(u,v)$ be a 
coordinate patch of $S$ and suppose that the immersion of $S$ in $\mathbb{R}^3$ is given by a function 
$F(u,v)$. Let 

$$\omega = dx \otimes dx + dy \otimes dy + dz \otimes dz$$

be the usual dot product on (the tangent spaces of) $\mathbb{R}^3$. Prove that 

$$F^*\omega = (F_u \cdot F_u) \, du \otimes du + (F_u \cdot F_v) \, du \otimes dv + (F_v \cdot F_u) \, dv \otimes du + (F_v \cdot F_v) \, dv \otimes dv,$$

where by $F_u \cdot F_u$, we mean the dot product of the vector $F_u$ with itself, and so on. 


5.5.8. Let $M = \mathbb{RP}^2$ be the manifold of the real projective plane. (Recall Example 3.1.6.) 
We use the homogeneous coordinates $(x : y : z)$ with $(x, y, z) \ne 0$ to locate points 
in $\mathbb{RP}^2$. Define the three open sets $U_1 = \{(x : y : z) \in \mathbb{RP}^2 \mid x \ne 0\}$ and similarly 
$U_2$ and $U_3$ where $y \ne 0$ and $z \ne 0$ respectively. We define the coordinate maps 
$\phi_1 : U_1 \to \mathbb{R}^2$ as $\phi_1(x : y : z) = (y/x, z/x)$, and similarly for $\phi_2$ and $\phi_3$. Use 
coordinates $(u,v)$ for the $(U_3, \phi_3)$ chart and $(r,s)$ for the $(U_2, \phi_2)$ chart. 
(a) Setting $(u,v) = \phi_{32}(r,s)$, show that $\phi_{32}(r,s) = (r/s, 1/s)$ and determine 
$d\phi_{32}$. 
(b) Consider the function $f : \mathbb{RP}^2 \to \mathbb{R}$ defined by $f(x : y : z) = 3yz/(x^2 + 2y^2 + z^2)$. 
Show that this function is well-defined on all of $\mathbb{RP}^2$. 
(c) Show that over $U_3$ we have 

$$(\phi_3^{-1})^*(df) = -\frac{6uv}{(u^2 + 2v^2 + 1)^2} \, du + \frac{3(u^2 - 2v^2 + 1)}{(u^2 + 2v^2 + 1)^2} \, dv.$$

(d) Determine the expression of $df$ in the $U_2$ coordinate chart, namely determine 
$(\phi_2^{-1})^*(df)$, and then show directly that 

$$(\phi_2^{-1})^*(df) = \phi_{32}^*\big( (\phi_3^{-1})^*(df) \big).$$

5.6 Lie Derivative of Tensor Fields 


As promised in earlier sections, we want to develop the notion of a type of derivative 
on tensor fields. Now that we have the notion of a pull-back of a covariant tensor 
field at our disposal, we can extend Definition 5.3.5 of the Lie derivative to tensors 
of any type. We start with Lie derivatives of covariant tensor fields. 


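Definition 5.6.1. Let $M$ be a $C^2$-manifold and let $X \in \mathfrak{X}(M)$ be a differentiable 
vector field on $M$ and let $\varphi_t$ be the flow of $X$ on $M$. If $\alpha \in \Gamma(TM^{*\otimes s})$ is a smooth 
covariant tensor field on $M$, we define the Lie derivative of $\alpha$ by $X$ as the covariant 
tensor field of the same type given by 

$$(\mathcal{L}_X \alpha)_p = \frac{d}{dt} \big( \varphi_t^* \alpha \big)_p \Big|_{t=0} = \lim_{h \to 0} \frac{1}{h} \Big( (\varphi_h^*)_p(\alpha_{\varphi_h(p)}) - \alpha_p \Big). \tag{5.31}$$

We emphasize that for all $h$ near 0, the difference $(\varphi_h^*)_p(\alpha_{\varphi_h(p)}) - \alpha_p$ is a 
difference of elements in $T_pM^{*\otimes s}$ so it makes sense. This is a coordinate-free description 
of the Lie derivative. 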


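Proposition 5.6.2. Let $(U, x)$ be a coordinate chart on a manifold $M$, let $X$ be a 
differentiable vector field on $M$ and let $\alpha$ be a covariant tensor field of type $(0, s)$. 
Suppose that with respect to the coordinate chart $(U, x)$, the components of $X$ are $X^k$ 
and the components of $\alpha$ are $\alpha_{j_1 j_2 \cdots j_s}$. Then the components of the Lie derivative 
of $\alpha$ are given by 

$$(\mathcal{L}_X \alpha)_{j_1 j_2 \cdots j_s} = X^k \frac{\partial \alpha_{j_1 j_2 \cdots j_s}}{\partial x^k} + \frac{\partial X^k}{\partial x^{j_1}} \alpha_{k j_2 \cdots j_s} + \cdots + \frac{\partial X^k}{\partial x^{j_s}} \alpha_{j_1 j_2 \cdots j_{s-1} k}.$$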


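Proof. Let $p \in U$ and let $v_1, v_2, \ldots, v_s$ be $s$ arbitrary vectors in $T_pM$. Then 

$$\begin{aligned}
(\mathcal{L}_X \alpha)_p(v_1, v_2, \ldots, v_s) &= \frac{d}{dt} (\varphi_t^* \alpha)_p(v_1, v_2, \ldots, v_s) \Big|_{t=0} \\
&= \frac{d}{dt} \, \alpha_{\varphi_t(p)}\big( (d\varphi_t)_p(v_1), (d\varphi_t)_p(v_2), \ldots, (d\varphi_t)_p(v_s) \big) \Big|_{t=0} \\
&= \frac{d}{dt} \Big( \alpha_{i_1 i_2 \cdots i_s}(\varphi_t(p)) \, \frac{\partial \varphi_t^{i_1}}{\partial x^{j_1}} v_1^{j_1} \, \frac{\partial \varphi_t^{i_2}}{\partial x^{j_2}} v_2^{j_2} \cdots \frac{\partial \varphi_t^{i_s}}{\partial x^{j_s}} v_s^{j_s} \Big) \Big|_{t=0}.
\end{aligned}$$

This expression involves summations over all the repeated indices, and on each term we 
involve a product rule with $s + 1$ functions in the parameter $t$. Using (5.19) and (5.20), 
the product rule gives 

$$\begin{aligned}
(\mathcal{L}_X \alpha)_p(v_1, v_2, \ldots, v_s) 
&= \frac{\partial \alpha_{i_1 i_2 \cdots i_s}}{\partial x^k} X^k \, \delta^{i_1}_{j_1} \cdots \delta^{i_s}_{j_s} \, v_1^{j_1} \cdots v_s^{j_s} 
+ \alpha_{i_1 i_2 \cdots i_s} \frac{\partial X^{i_1}}{\partial x^{j_1}} \, \delta^{i_2}_{j_2} \cdots \delta^{i_s}_{j_s} \, v_1^{j_1} \cdots v_s^{j_s} \\
&\quad + \cdots + \alpha_{i_1 i_2 \cdots i_s} \, \delta^{i_1}_{j_1} \cdots \delta^{i_{s-1}}_{j_{s-1}} \frac{\partial X^{i_s}}{\partial x^{j_s}} \, v_1^{j_1} \cdots v_s^{j_s}.
\end{aligned}$$

After relabeling the indices of summation as necessary, we find that 

$$(\mathcal{L}_X \alpha)_p(v_1, v_2, \ldots, v_s) = \Big( X^k \frac{\partial \alpha_{j_1 j_2 \cdots j_s}}{\partial x^k} + \frac{\partial X^k}{\partial x^{j_1}} \alpha_{k j_2 \cdots j_s} + \cdots + \frac{\partial X^k}{\partial x^{j_s}} \alpha_{j_1 j_2 \cdots j_{s-1} k} \Big) v_1^{j_1} v_2^{j_2} \cdots v_s^{j_s},$$

with all component functions evaluated at $p$. The proposition follows. 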


So far we have defined the Lie derivative on (a) functions on M, (b) vector fields 
on M, and (c) covariant tensor fields on M. The latter case includes covector fields 
and k-forms. Before we give a complete definition for the Lie derivative, we consider 
how the Lie derivative interacts with various operations on tensors or forms that 
we have introduced so far. 

Problem 5.5.6 generalizes to contraction of a vector field with any covariant 
tensor field. This establishes the following proposition. 


Proposition 5.6.3. Let $X$ be a vector field on $M$ and let $\alpha$ be any covariant 
tensor field of type $(0, s)$. If $Y_1, Y_2, \ldots, Y_s$ are $s$ vector fields on $M$, then the 
Lie derivative of the contraction is 

$$\mathcal{L}_X(\alpha(Y_1, \ldots, Y_s)) = (\mathcal{L}_X \alpha)(Y_1, \ldots, Y_s) + \alpha(\mathcal{L}_X Y_1, Y_2, \ldots, Y_s) + \cdots + \alpha(Y_1, Y_2, \ldots, \mathcal{L}_X Y_s).$$

Finally, let $f \in C^2(M,\mathbb{R})$ be a differentiable function on $M$. Then the differential 
$df$ is a 1-form, i.e., a covariant vector field. 


Proposition 5.6.4. For any differentiable vector field $X$ on $M$, the operators $\mathcal{L}_X$ 
and $d$ commute on $C^2(M,\mathbb{R})$. In other words 

$$\mathcal{L}_X(df) = d(\mathcal{L}_X f).$$

Proof. (Left as an exercise for the reader. See Problem 5.6.1.) 
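Proposition 5.6.4 can be observed numerically in a coordinate chart. For a 1-form, Proposition 5.6.2 reads $(\mathcal{L}_X \alpha)_j = X^k \partial_k \alpha_j + (\partial_j X^k) \alpha_k$; taking $\alpha = df$, this should agree with $\partial_j (X^k \partial_k f)$. The sketch below compares the two sides on $\mathbb{R}^2$ using central differences. The vector field `X`, the function `f`, and the step size are hypothetical choices of ours, and agreement is only up to discretization error.

```python
import math

H = 1e-3  # finite-difference step (an illustrative choice)

def partial(g, i, h=H):
    """Return the central-difference approximation of dg/dx^i as a function."""
    def dg(p):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        return (g(hi) - g(lo)) / (2 * h)
    return dg

# Hypothetical vector field X = X^1 d_1 + X^2 d_2 and function f on R^2.
X = [lambda p: p[0] * p[1],
     lambda p: math.sin(p[0]) + p[1] ** 2]
f = lambda p: math.exp(p[0]) * math.cos(p[1])

def lie_of_df(j, p):
    """j-th component of L_X(df), from the coordinate formula of Proposition 5.6.2."""
    alpha = [partial(f, k) for k in range(2)]  # components of df
    transport = sum(X[k](p) * partial(alpha[j], k)(p) for k in range(2))
    stretch = sum(partial(X[k], j)(p) * alpha[k](p) for k in range(2))
    return transport + stretch

def d_of_lie_f(j, p):
    """j-th component of d(L_X f), where L_X f = X(f) = X^k d_k f."""
    Xf = lambda q: sum(X[k](q) * partial(f, k)(q) for k in range(2))
    return partial(Xf, j)(p)

if __name__ == "__main__":
    point = [0.3, 0.7]
    for j in range(2):
        print(j, lie_of_df(j, point), d_of_lie_f(j, point))
```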


Before defining the Lie derivative for a general tensor field, we list a few results we 
have established so far. Let $M$ be a differentiable manifold, let $f$ be a differentiable 
function on $M$, let $X, Y, Z, Y_1, \ldots, Y_s$ be vector fields on $M$, and let $\alpha$ be a covariant 
tensor field of type $(0, s)$ on $M$. 

1. Definition 5.3.5: $\mathcal{L}_X f = X(f)$. 

2. Theorem 5.3.10: $\mathcal{L}_X Y = [X, Y]$. 

3. Linearity. Proposition 5.3.9: $\mathcal{L}_X(Y + Z) = \mathcal{L}_X Y + \mathcal{L}_X Z$. 

4. Product rule. Proposition 5.3.9: $\mathcal{L}_X(fY) = (\mathcal{L}_X f) Y + f(\mathcal{L}_X Y)$. 

5. Contraction. Proposition 5.6.3: 

$$\mathcal{L}_X(\alpha(Y_1, \ldots, Y_s)) = (\mathcal{L}_X \alpha)(Y_1, \ldots, Y_s) + \alpha(\mathcal{L}_X Y_1, \ldots, Y_s) + \cdots + \alpha(Y_1, \ldots, \mathcal{L}_X Y_s).$$

6. Proposition 5.6.4: $\mathcal{L}_X \circ d = d \circ \mathcal{L}_X$ on functions. 


We can now present a definition of the Lie derivative on tensors of type (r, s) 
with r > 2 or with r=1 and s>0. 


Definition 5.6.5. Let X be a differentiable vector field on a manifold M. For all 
pairs (r,s) of nonnegative integers, we define the Lie derivative £x as the linear 
transformation on the vector space [([M®" ® TM*®s) of tensor fields satisfying 
Definition 5.3.5 for vector fields, Defintion 5.31 for covector fields, as well as the 
product rule 


Lx(S@T) = (LxS)®@T+S5® (LxT) (5.32) 
for any tensor fields S and T. 


It is not hard to show that the full Definition 5.31 for any covariant tensor field 
satisfies the product rule (5.32) applied to tensor products of covariant tensor fields. 
By virtue of the properties already established, imposing this additional product 
rule allows us to define the Lie derivative on any tensor field. 

We took a coordinate-free approach to defining the Lie derivative. This is
essential for knowing that the construction has mathematical meaning. The following
proposition gives the coordinate-dependent description of the Lie derivative.
The proof of this proposition is left as an exercise. Furthermore, this proposition
gives a coordinate-dependent way to show that the Lie derivative of a tensor field
of type $(r,s)$ is again a tensor field of type $(r,s)$. (See Problem 5.6.3.)


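Proposition 5.6.6. Let $M$ be a smooth manifold and let $X \in \mathfrak{X}(M)$. Let
$A \in \Gamma(TM^{\otimes r} \otimes TM^{*\otimes s})$ be a smooth tensor field of type $(r,s)$. Suppose that over some
coordinate chart $(U,x)$ of $M$, the components of $X$ are $X^i$ and that the components
of $A$ are $A^{i_1 i_2\cdots i_r}_{j_1 j_2\cdots j_s}$. Then the components of $\mathcal{L}_X A$ are

\[
\begin{aligned}
(\mathcal{L}_X A)^{i_1 i_2\cdots i_r}_{j_1 j_2\cdots j_s}
={}& X^k \frac{\partial A^{i_1 i_2\cdots i_r}_{j_1 j_2\cdots j_s}}{\partial x^k}
- \frac{\partial X^{i_1}}{\partial x^k} A^{k i_2\cdots i_r}_{j_1 j_2\cdots j_s}
- \cdots
- \frac{\partial X^{i_r}}{\partial x^k} A^{i_1\cdots i_{r-1} k}_{j_1 j_2\cdots j_s} \\
&+ \frac{\partial X^k}{\partial x^{j_1}} A^{i_1 i_2\cdots i_r}_{k j_2\cdots j_s}
+ \cdots
+ \frac{\partial X^k}{\partial x^{j_s}} A^{i_1 i_2\cdots i_r}_{j_1 j_2\cdots j_{s-1} k}.
\end{aligned}
\tag{5.33}
\]

Intuitively speaking, the Lie derivative of a tensor field generalizes the concept
of a directional derivative in $\mathbb{R}^n$ to any manifold and applied to any tensor field.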
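
As a quick sanity check of the coordinate formula (5.33), it may help to write out the two
lowest-rank cases. This is only a sketch under the index conventions of Proposition 5.6.6;
the symbols $Y$ and $\omega$ are our choice of names for a vector field and a 1-form. The
vector-field case reproduces the components of the Lie bracket, in agreement with
Theorem 5.3.10.

```latex
% Type (1,0): A = Y is a vector field, so only the "upper index" term appears.
(\mathcal{L}_X Y)^i
  = X^k \frac{\partial Y^i}{\partial x^k} - \frac{\partial X^i}{\partial x^k}\, Y^k
  = [X,Y]^i .

% Type (0,1): A = \omega is a 1-form, so only the "lower index" term appears.
(\mathcal{L}_X \omega)_j
  = X^k \frac{\partial \omega_j}{\partial x^k} + \frac{\partial X^k}{\partial x^j}\, \omega_k .
```

Taking $\omega = dx^i$ in the second line, so that $\omega_j = \delta^i_j$, gives
$(\mathcal{L}_X dx^i)_j = \partial X^i/\partial x^j$.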

We finish this section with the Cartan formula, also called the Cartan magic 
formula. The result is interesting in itself but the proof is interesting as well since it 
affords us the opportunity to use some of the more algebraic techniques presented 
in Section 4.7. 

In Problem 5.4.16 we discussed the interior product of a vector field $X$ with an
$r$-form $\omega$, written $i_X\omega$. This interior product is essentially the contraction of $X$
with the first component of the $r$-form, but we must remember that for an $r$-tuple
of indices $i_1 < i_2 < \cdots < i_r$,

\[
dx^{i_1} \wedge \cdots \wedge dx^{i_r} = A(dx^{i_1} \otimes \cdots \otimes dx^{i_r}).
\]


Proposition 5.6.7 (Cartan Formula). Let $X$ be a differentiable vector field on a
smooth manifold $M$. Then as operators $\Omega^\bullet(M) \to \Omega^\bullet(M)$,

\[
\mathcal{L}_X = d \circ i_X + i_X \circ d.
\]


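Before proving the Cartan formula, we point out one of the reasons this result
might be surprising. The operators involved are shown in the following diagram.

\[
\begin{array}{ccc}
\Omega^s(M) & \xrightarrow{\ d\ } & \Omega^{s+1}(M) \\
{\scriptstyle i_X}\downarrow & & \downarrow{\scriptstyle i_X} \\
\Omega^{s-1}(M) & \xrightarrow{\ d\ } & \Omega^s(M)
\end{array}
\]

This diagram is not commutative, i.e., in general $i_X \circ d$ is not equal to $d \circ i_X$.
However, it is interesting to see that the Lie derivative $\mathcal{L}_X$ decomposes into a part that
goes through $\Omega^{s+1}(M)$ and another part that goes through $\Omega^{s-1}(M)$.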


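Proof of Cartan formula. By definition of the Lie derivative, since it obeys the Leibniz
rule, $\mathcal{L}_X$ is a derivation on the algebra of differential forms $\Omega^\bullet(M)$.
Now let $\omega \in \Omega^r(M)$ and $\eta \in \Omega^s(M)$. Then using the result of Problem 5.4.16,

\[
\begin{aligned}
(d \circ i_X + i_X \circ d)(\omega \wedge \eta)
&= d(i_X(\omega \wedge \eta)) + i_X(d(\omega \wedge \eta)) \\
&= d\big((i_X\omega) \wedge \eta + (-1)^r \omega \wedge (i_X\eta)\big) + i_X\big(d\omega \wedge \eta + (-1)^r \omega \wedge d\eta\big) \\
&= d(i_X\omega) \wedge \eta + (-1)^{r-1}(i_X\omega) \wedge (d\eta) + (-1)^r (d\omega) \wedge (i_X\eta) \\
&\quad + (-1)^{2r} \omega \wedge (d(i_X\eta)) + (i_X(d\omega)) \wedge \eta + (-1)^{r+1} (d\omega) \wedge (i_X\eta) \\
&\quad + (-1)^r (i_X\omega) \wedge (d\eta) + (-1)^{2r} \omega \wedge (i_X(d\eta)) \\
&= \big(d(i_X\omega) + i_X(d\omega)\big) \wedge \eta + \omega \wedge \big(d(i_X\eta) + i_X(d\eta)\big).
\end{aligned}
\]

Thus, the operation $d \circ i_X + i_X \circ d$ is a derivation on $\Omega^\bullet(M)$.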

We prove the Cartan formula by using Proposition 4.7.12 and observing that
over every coordinate chart $(U,x)$ of $M$, as an algebra, $\Omega^\bullet(U)$ is generated by
$\Omega^0(U) = C^\infty(U,\mathbb{R})$ and the 1-forms $dx^i$.

We first prove that $\mathcal{L}_X$ and $d \circ i_X + i_X \circ d$ are equal on $C^\infty(U,\mathbb{R})$. By definition,
$\mathcal{L}_X f = X(f)$ for all $f \in C^\infty(M,\mathbb{R})$. With respect to a coordinate system, $X(f) =
X^i \partial_i f$. On the other hand, $i_X f = 0$ by definition, so in coordinates

\[
(i_X \circ d + d \circ i_X)(f) = i_X(df) = i_X(\partial_i f\, dx^i) = X^i \partial_i f.
\]

This shows that $\mathcal{L}_X$ and $i_X \circ d + d \circ i_X$ agree on the set of differentiable functions.
Considering the 1-forms $dx^i$, by Proposition 5.6.6, $\mathcal{L}_X(dx^i) = \frac{\partial X^i}{\partial x^j}\, dx^j$ and, since $d(dx^i) = 0$,

\[
(d \circ i_X + i_X \circ d)(dx^i) = d(i_X(dx^i)) = d(X^i) = \frac{\partial X^i}{\partial x^j}\, dx^j.
\]

Thus, $\mathcal{L}_X$ and $i_X \circ d + d \circ i_X$ agree also on $dx^i$ for all $i = 1, 2, \ldots, n$.

We have shown that for any coordinate chart $U$ of $M$, the operations $\mathcal{L}_X$ and
$i_X \circ d + d \circ i_X$ are derivations on $\Omega^\bullet(U)$ that agree on a generating set of $\Omega^\bullet(U)$. By
Proposition 4.7.12, they are equal on $\Omega^\bullet(U)$. Since this is true for any coordinate
chart, $\mathcal{L}_X = i_X \circ d + d \circ i_X$ on $\Omega^\bullet(M)$.
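
A small worked check may make the formula concrete. The choice of $X$ and $\omega$ here is
ours, not the text's: take $M = \mathbb{R}^2$ with coordinates $(x,y)$, the rotation field
$X = -y\,\partial_x + x\,\partial_y$, and the 1-form $\omega = x\,dy$. Both sides of the Cartan
formula can then be computed by hand, with the left side taken from the coordinate
expression (5.33) specialized to 1-forms.

```latex
% Right-hand side: d(i_X \omega) + i_X(d\omega).
\begin{align*}
i_X\omega &= x\,X^y = x^2, & d(i_X\omega) &= 2x\,dx,\\
d\omega &= dx\wedge dy, & i_X(d\omega) &= X^x\,dy - X^y\,dx = -y\,dy - x\,dx,
\end{align*}
\[
(d\circ i_X + i_X\circ d)\,\omega = 2x\,dx - x\,dx - y\,dy = x\,dx - y\,dy .
\]

% Left-hand side from (5.33), with \omega_x = 0 and \omega_y = x.
\begin{align*}
(\mathcal{L}_X\omega)_x &= X^k\,\partial_k\omega_x + \omega_k\,\partial_x X^k
                         = 0 + x\,\partial_x(x) = x,\\
(\mathcal{L}_X\omega)_y &= X^k\,\partial_k\omega_y + \omega_k\,\partial_y X^k
                         = -y\,\partial_x(x) + 0 = -y .
\end{align*}
```

Both computations give $x\,dx - y\,dy$, as the Cartan formula predicts.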


PROBLEMS 

5.6.1. Prove Proposition 5.6.4. [Hint: Use a coordinate-dependent approach.]

5.6.2. Prove Proposition 5.6.6.

5.6.3. Let $X^i$ represent the components of a vector field on a manifold and let $A^{i_1\cdots i_r}_{j_1\cdots j_s}$
be the components of a tensor field of type $(r,s)$. Following a similar coordinate-
dependent approach as taken in Example 4.5.9, prove that the collection of func-
tions defined on the right-hand side of (5.33) form the components of a tensor field
of type $(r,s)$.

5.6.4. Let $X$ and $Y$ be vector fields and let $T$ be any tensor field. Prove that $\mathcal{L}_{[X,Y]}T =
\mathcal{L}_X\mathcal{L}_Y T - \mathcal{L}_Y\mathcal{L}_X T$. Consequently, we can write as operators $\mathcal{L}_{[X,Y]} = \mathcal{L}_X\mathcal{L}_Y -
\mathcal{L}_Y\mathcal{L}_X$.


5.7 Integration on Manifolds - Definition 


The sections in the chapter so far discussed vector fields and tensor fields on mani- 
folds, and two methods that provide a sort of derivative, namely the exterior differ- 
ential on r-forms and the Lie derivative by a vector field. The remaining sections 
present the theory of integration on manifolds. This section develops the definition 
of integration, Section 5.8 presents calculations and applications with integration, 
and finally Section 5.9 discusses Stokes’ Theorem. 

The theory of integration on manifolds must generalize all types of integration
introduced in the usual calculus sequence. This includes

• integration of a one-variable, real-valued function over an interval;

• integration of a multivariable, real-valued function over a domain in $\mathbb{R}^n$;

• line integrals of functions in $\mathbb{R}^n$;

• line integrals of vector fields in $\mathbb{R}^n$;

• surface integrals of a real-valued function defined over a closed and bounded
region of a regular surface;

• surface integrals of vector fields in $\mathbb{R}^3$.

One of the beauties of differential forms is that they will allow for a single concise
description that does generalize all of these types of integrals.

Readers may be aware of the difference between Riemann integration, the
theory introduced in the usual calculus sequence, and Lebesgue integration, which
relies on measure theory. The theory developed here does not inherently depend
on either of these theories of integration but could use either. The definitions for
integration on a manifold use the fact that a manifold is locally diffeomorphic to an
open subset in $\mathbb{R}^n$ and define an integral on a manifold in reference to integration
on $\mathbb{R}^n$. Therefore, we can presuppose the use of either Riemann integration,
Lebesgue integration, or any other theory of integration of functions over $\mathbb{R}^n$.


5.7.1 Partitions of Unity 


The basis for defining integration on a smooth manifold $M^n$ relies on relating the
integral on $M$ to integration in $\mathbb{R}^n$. However, since a manifold is only locally
homeomorphic to an open set in $\mathbb{R}^n$, one can directly define integration on a
manifold only over a coordinate patch.

We begin this section by introducing a technical construction that makes it 
possible, even from just a theoretical perspective, to piece together the integrals of 
a function over the different coordinate patches of the manifold’s atlas. 


Definition 5.7.1. Let $M$ be a manifold, and let $\mathcal{V} = \{V_\alpha\}_{\alpha \in I}$ be a collection of
open sets that covers $M$. A partition of unity subordinate to $\mathcal{V}$ is a collection of
continuous functions $\{\psi_\alpha : M \to \mathbb{R}\}_{\alpha \in I}$ that satisfy the following properties:

1. $0 \leq \psi_\alpha(x) \leq 1$ for all $\alpha \in I$ and all $x \in M$.

2. $\psi_\alpha(x)$ vanishes outside a compact subset of $V_\alpha$.

3. For all $x \in M$, there exists only a finite number of $\alpha \in I$ such that $\psi_\alpha(x) \neq 0$.

4. $\sum_{\alpha \in I} \psi_\alpha(x) = 1$ for all $x \in M$.


The summation in the fourth condition always exists since, for all $x \in M$, it is
only a finite sum by the third criterion. Therefore, we do not worry about issues
of convergence in this definition. The terminology “partition of unity” comes from
the fact that the collection of functions $\{\psi_\alpha\}$ add up to the constant function 1 on
$M$.

Figure 5.6: $f(x) = e^{-1/x}$ if $x > 0$ and $0$ if $x \leq 0$.


Theorem 5.7.2 (Existence of Partitions of Unity). Let $M$ be a smooth manifold
with atlas $\mathcal{A} = \{(U_\alpha, \phi_\alpha)\}_{\alpha \in I}$. There exists a smooth partition of unity of $M$
subordinate to $\mathcal{A}$.


For the sake of space, we forgo a complete proof of this theorem and refer the
reader to [33, pp. 54-55], [49, Theorem 10.8], or [15, Section 14.1]. The proof relies
on the existence of smooth real-valued functions that are nonzero in an open set
$U \subset \mathbb{R}^n$ but identically 0 outside of $U$. Many of the common examples of partitions
of unity depend on the following lemma.


Lemma 5.7.3. The function $f : \mathbb{R} \to \mathbb{R}$ defined by

\[
f(x) = \begin{cases} 0, & \text{if } x \leq 0, \\ e^{-1/x}, & \text{if } x > 0, \end{cases}
\]

is a smooth function. (See Figure 5.6.)


The proof for this lemma is an exercise in calculating higher derivatives and
evaluating limits. Interestingly enough, this function at $x = 0$ is an example of a
function that is smooth, i.e., has all its higher derivatives, but is not analytic, i.e.,
equal to its Taylor series over a neighborhood of $x = 0$.

The function $f(x)$ in Lemma 5.7.3 is useful because it passes smoothly from
constant behavior to nonconstant behavior. This function $f(x)$ also leads immedi-
ately to functions with other desirable properties. For example, $f(x - a) + b$ is a
smooth function that is constant and equal to $b$ for $x \leq a$ and then nonconstant for
$x > a$. In contrast, $f(a - x) + b$ is a smooth function that is constant and equal
to $b$ for $x \geq a$ and then nonconstant for $x < a$. More useful still for our purposes,
if $a < b$, the function $g(x) = f(x - a) f(b - x)$ is smooth, identically equal to 0 for
$x \notin (a,b)$, and is nonzero for $x \in (a,b)$. We can call this a bump function over $(a,b)$
(see Figure 5.7). Also, the function

\[
h(x) = \frac{f(b - x)}{f(x - a) + f(b - x)} \tag{5.34}
\]

is smooth, is identically equal to 1 for $x \leq a$, identically equal to 0 for $x \geq b$,
and strictly decreasing over $(a,b)$. The function $h(x)$ is sometimes called a cut-off
function (see Figure 5.8).

Figure 5.7: Bump function. Figure 5.8: Cut-off function.
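
The three functions just described are easy to experiment with numerically. The following
is a minimal sketch in Python, assuming only the formulas of Lemma 5.7.3 and Equation (5.34);
the names `f`, `bump`, and `cutoff` are ours and purely illustrative.

```python
import math


def f(x: float) -> float:
    """The function of Lemma 5.7.3: 0 for x <= 0 and exp(-1/x) for x > 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


def bump(x: float, a: float, b: float) -> float:
    """g(x) = f(x - a) f(b - x): positive on (a, b) and zero elsewhere."""
    return f(x - a) * f(b - x)


def cutoff(x: float, a: float, b: float) -> float:
    """h(x) of (5.34): equal to 1 for x <= a and to 0 for x >= b.

    For a < b the denominator never vanishes, because every x satisfies
    x > a or x < b, so at least one of the two terms is positive.
    """
    return f(b - x) / (f(x - a) + f(b - x))


if __name__ == "__main__":
    a, b = 0.0, 1.0
    for x in (-0.5, 0.0, 0.25, 0.5, 0.75, 1.0, 1.5):
        print(f"x = {x:5.2f}   g(x) = {bump(x, a, b):.6f}   h(x) = {cutoff(x, a, b):.6f}")
```

Tabulating `cutoff` on a fine grid shows the transition from 1 to 0 taking place entirely
inside $(a,b)$, while `bump` is exactly zero outside that interval.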


We will illustrate how to construct partitions of unity over a manifold with the 
following two simple examples. 


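Example 5.7.4. Consider the real line $\mathbb{R}$ as a 1-manifold, and consider the open
cover $\mathcal{U} = \{U_i\}_{i \in \mathbb{Z}}$, where $U_i = (i - 1, i + 1)$. In this open cover, we note that if $n$
is an integer, then $n$ is contained in only one set, $U_n$, and if $t$ is not an integer,
then $t$ is contained in both $U_{\lfloor t \rfloor}$ and $U_{\lfloor t \rfloor + 1}$. Consider first the bump functions
$g_i(x) = f(x - (i - 0.9))\, f((i + 0.9) - x)$, which satisfy

\[
g_i(x) = \begin{cases}
0, & \text{if } x \leq i - 0.9, \\
e^{-1/(x - i + 0.9)}\, e^{-1/(i + 0.9 - x)}, & \text{if } i - 0.9 < x < i + 0.9, \\
0, & \text{if } x \geq i + 0.9,
\end{cases}
\]

where we use the function $f$ as defined in Lemma 5.7.3. It is not hard to show
that these functions are smooth. Furthermore, by definition, $g_i(x) = 0$ for $x \notin
(i - 0.9, i + 0.9)$, so $g_i$ vanishes outside $K_i = [i - 0.9, i + 0.9]$, which is a compact subset of $U_i$.
For any $i \in \mathbb{Z}$, the only functions that are not identically 0 on $U_i$ are $g_{i-1}$, $g_i$, and $g_{i+1}$.
Now define

\[
\psi_i(x) = \frac{g_i(x)}{g_{i-1}(x) + g_i(x) + g_{i+1}(x)}
\]

for $x \in U_i$, where the denominator is positive, and $\psi_i(x) = 0$ for $x \notin U_i$.
We claim that the collection $\{\psi_i\}_{i \in \mathbb{Z}}$ forms a smooth partition of unity subordinate
to $\mathcal{U}$. Again, $\psi_i(x) \neq 0$ for $x$ in the interior of $K_i$ and $\psi_i(x) = 0$ otherwise. Furthermore, the only
functions $\psi_j$ that are not identically 0 on $U_i$ are $\psi_{i-1}$, $\psi_i$, and $\psi_{i+1}$. If $x = n$ is an
integer, then

\[
\sum_{i \in \mathbb{Z}} \psi_i(n) = \psi_n(n) = \frac{g_n(n)}{g_{n-1}(n) + g_n(n) + g_{n+1}(n)} = \frac{g_n(n)}{g_n(n)} = 1.
\]

Figure 5.9: Example 5.7.5.

If instead $x$ is not an integer, then when we set $n = \lfloor x \rfloor$, we have

\[
\begin{aligned}
\sum_{i \in \mathbb{Z}} \psi_i(x) &= \psi_n(x) + \psi_{n+1}(x) \\
&= \frac{g_n(x)}{g_{n-1}(x) + g_n(x) + g_{n+1}(x)} + \frac{g_{n+1}(x)}{g_n(x) + g_{n+1}(x) + g_{n+2}(x)} \\
&= \frac{g_n(x)}{g_n(x) + g_{n+1}(x)} + \frac{g_{n+1}(x)}{g_n(x) + g_{n+1}(x)} = 1,
\end{aligned}
\]

since $g_{n-1}(x) = g_{n+2}(x) = 0$ for $x \in U_n \cap U_{n+1}$.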
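
The construction of Example 5.7.4 can be checked numerically. The sketch below assumes the
formulas of the example; the function names are ours, and `psi` is set to 0 wherever its
numerator vanishes so that no division by zero can occur.

```python
import math


def f(x: float) -> float:
    """The function of Lemma 5.7.3."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


def g(i: int, x: float) -> float:
    """Bump function g_i, which vanishes outside K_i = [i - 0.9, i + 0.9]."""
    return f(x - (i - 0.9)) * f((i + 0.9) - x)


def psi(i: int, x: float) -> float:
    """psi_i = g_i / (g_{i-1} + g_i + g_{i+1}), taken to be 0 where g_i vanishes."""
    numerator = g(i, x)
    if numerator == 0.0:
        return 0.0
    return numerator / (g(i - 1, x) + numerator + g(i + 1, x))


def total(x: float) -> float:
    """Sum of all psi_i(x); only i = floor(x) and i = floor(x) + 1 can contribute."""
    n = math.floor(x)
    return psi(n, x) + psi(n + 1, x)


if __name__ == "__main__":
    for x in (-2.0, -0.37, 0.0, 0.05, 0.5, 0.95, 3.1415):
        print(f"x = {x:8.4f}   sum of psi_i(x) = {total(x):.15f}")
```

Up to floating-point rounding, every printed sum equals 1, which is the fourth condition of
Definition 5.7.1.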


Example 5.7.5. Consider the unit sphere $S^2$ given as a subset of $\mathbb{R}^3$. Cover $S^2$
with two coordinate patches $(U_1, x)$ and $(U_2, \bar{x})$, where the coordinate functions
have the following inverses:

\[
\begin{aligned}
x^{-1}(u,v) &= (\cos u \sin v,\ \sin u \sin v,\ \cos v), \\
\bar{x}^{-1}(u,v) &= (-\cos u \sin v,\ -\cos v,\ -\sin u \sin v)
\end{aligned}
\]
for $(u,v) \in (0,2\pi) \times (0,\pi)$.
Define now the bump functions
\[
\begin{aligned}
g_1(u,v) &= f(u - 0.1)\, f(6 - u)\, f(v - 0.1)\, f(3 - v), \\
g_2(u,v) &= f(u - 0.1)\, f(6 - u)\, f(v - 0.1)\, f(3 - v),
\end{aligned}
\]
where $f$ is the function in Lemma 5.7.3. These functions are smooth and vanish
outside $[0.1,6] \times [0.1,3] = K$, which is a compact subset of $(0,2\pi) \times (0,\pi)$. Define
also the bump functions $h_i : S^2 \to \mathbb{R}$ by

\[
h_1(p) = \begin{cases} g_1 \circ x(p), & \text{if } p \in U_1, \\ 0, & \text{if } p \notin U_1, \end{cases}
\qquad \text{and} \qquad
h_2(p) = \begin{cases} g_2 \circ \bar{x}(p), & \text{if } p \in U_2, \\ 0, & \text{if } p \notin U_2. \end{cases}
\]

By construction, these functions are smooth on $S^2$ and vanish outside a compact
subset of $U_1$ and $U_2$, namely, $x^{-1}(K)$ and $\bar{x}^{-1}(K)$ respectively. In Figure 5.9, the
half-circles depict the complements of $U_1$ and $U_2$ on $S^2$, and the piecewise-smooth
curves that surround the semicircles show the boundary of $x^{-1}(K)$ and $\bar{x}^{-1}(K)$.
Finally, define the functions $\psi_i : S^2 \to \mathbb{R}$ by

\[
\psi_i(p) = \frac{h_i(p)}{h_1(p) + h_2(p)}.
\]

These functions are well defined since $h_1$ and $h_2$ are nonzero on the interior of
$x^{-1}(K)$ and $\bar{x}^{-1}(K)$, respectively, and these interiors cover $S^2$. The pair of functions
$\{\psi_1, \psi_2\}$ is a smooth partition of unity that is subordinate to the atlas that we
defined on $S^2$.


An object that recurs when dealing with partitions of unity is the set over which 
the function is nonzero. We make the following definition. 


Definition 5.7.6. Let $f : M \to \mathbb{R}$ be a real-valued function from a manifold $M$.
The support of $f$, written $\operatorname{Supp} f$, is defined as the closure of the non-zero set, i.e.,

\[
\operatorname{Supp} f = \overline{\{p \in M \mid f(p) \neq 0\}}.
\]

A function is said to have compact support if $\operatorname{Supp} f$ is a compact set.


With this terminology, the second criterion concerning functions in a partition
of unity $\{\psi_\alpha\}_{\alpha \in I}$ subordinate to a given atlas is that each function has a support
that is compact and in an open set of the atlas.


5.7.2 Integrating Differential Forms 


We are now in a position to define integration of $n$-forms on a smooth $n$-dimensional
manifold. We must begin by connecting integration of forms in $\mathbb{R}^n$ to usual inte-
gration.


Definition 5.7.7. Let $\omega$ be a differential $n$-form over $\mathbb{R}^n$. Let $K$ be a compact
subset of $\mathbb{R}^n$. If
\[
\omega = f(x^1, \ldots, x^n)\, dx^1 \wedge \cdots \wedge dx^n,
\]
then we define the integral

\[
\int_K \omega = \int_K f(x^1, \ldots, x^n)\, dx^1\, dx^2 \cdots dx^n = \int_K f\, dV,
\]

where the right-hand side represents the usual Riemann integral.


(As pointed out at the beginning of this section, we can also use the Lebesgue
integral instead of the Riemann integral.) Also, if $\omega$ is a form that vanishes outside
a compact set $K$, which is a subset of an open set $U$, then we define $\int_U \omega = \int_K \omega$.

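In order to connect the integration on a manifold $M^n$ to integration in $\mathbb{R}^n$, we
must first show that this can be done independent of the coordinate system.

Lemma 5.7.8. Let $M^n$ be a smooth, oriented manifold with atlas $\mathcal{A} = \{(U_i, \phi_i)\}_{i \in I}$.
Let $K$ be a compact set with $K \subset U_1 \cap U_2$, and let $\omega$ be an $n$-form that vanishes
outside of $K$. Setting $V_i = \phi_i(U_i)$ for $i = 1, 2$, the following integrals are equal:

\[
\int_{V_1} (\phi_1^{-1})^*\omega = \int_{V_2} (\phi_2^{-1})^*\omega.
\]


Proof. Using the standard notation for transition functions, write $V_{\alpha\beta} = \phi_\alpha(U_\alpha \cap
U_\beta)$ and $\phi_{21} = \phi_2 \circ \phi_1^{-1}$, a homeomorphism from $V_{12}$ to $V_{21}$. We use coordinates
$(x^1, \ldots, x^n)$ on the chart $(U_1, \phi_1)$ and $(y^1, \ldots, y^n)$ on the patch $(U_2, \phi_2)$. We write

\[
\begin{aligned}
(\phi_1^{-1})^*(\omega) &= f(x^1, \ldots, x^n)\, dx^1 \wedge \cdots \wedge dx^n, \\
(\phi_2^{-1})^*(\omega) &= \tilde{f}(y^1, \ldots, y^n)\, dy^1 \wedge \cdots \wedge dy^n
\end{aligned}
\]

as $n$-forms in $\mathbb{R}^n$, which by hypothesis are zero outside of $V_{12}$ and $V_{21}$ respectively.
According to Definition 5.7.7,

\[
\begin{aligned}
\int_{V_{12}} (\phi_1^{-1})^*\omega &= \int_{V_{12}} f(x^1, \ldots, x^n)\, dx^1 \cdots dx^n, \\
\int_{V_{21}} (\phi_2^{-1})^*\omega &= \int_{V_{21}} \tilde{f}(y^1, \ldots, y^n)\, dy^1 \cdots dy^n.
\end{aligned}
\]

We note that $\phi_2^{-1} = \phi_1^{-1} \circ \phi_{12}$, so $(\phi_2^{-1})^* = \phi_{12}^* \circ (\phi_1^{-1})^*$. Hence

\[
\begin{aligned}
\tilde{f}(y^1, \ldots, y^n)\, dy^1 \wedge \cdots \wedge dy^n = (\phi_2^{-1})^*(\omega) &= \phi_{12}^*\big((\phi_1^{-1})^*\omega\big) \\
&= \phi_{12}^*\big(f(x^1, \ldots, x^n)\, dx^1 \wedge \cdots \wedge dx^n\big) \\
&= (f \circ \phi_{12})(y^1, \ldots, y^n)\, (\det d\phi_{12})\, dy^1 \wedge \cdots \wedge dy^n,
\end{aligned}
\]

where the last equality follows from (5.30). Since the manifold is oriented, $\det d\phi_{12} >
0$. Consequently, we have

\[
\begin{aligned}
\int_{V_{21}} (\phi_2^{-1})^*(\omega) &= \int_{V_{21}} (f \circ \phi_{12})(y^1, \ldots, y^n)\, (\det d\phi_{12})\, dy^1 \cdots dy^n \\
&= \int_{V_{21}} (f \circ \phi_{12})(y^1, \ldots, y^n)\, |\det d\phi_{12}|\, dy^1 \cdots dy^n \\
&= \int_{V_{12}} f(x^1, \ldots, x^n)\, dx^1 \cdots dx^n = \int_{V_{12}} (\phi_1^{-1})^*(\omega),
\end{aligned}
\]

where the second to last equality holds by the usual substitution-of-variables formula
for integration.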


This lemma justifies the following definition in that it is independent of the
choice of coordinate system.

Definition 5.7.9. Let $M$ be an oriented, smooth $n$-dimensional manifold. Let $\omega$
be an $n$-form that vanishes outside of a compact subset $K$ of $M$, and suppose that
$K$ is also a subset of a coordinate neighborhood $(U, \phi)$. Then we define the integral

\[
\int_M \omega = \int_{\phi(U)} (\phi^{-1})^*\omega,
\]

where the right-hand side is an integral over $\mathbb{R}^n$ given by Definition 5.7.7.


This definition explains how to integrate an $n$-form when it vanishes outside a
compact subset of a coordinate patch. If this latter criterion does not hold, we use
partitions of unity to piece together calculations that fall under Definition 5.7.9.

If a manifold $M$ is not orientable, then for any atlas there will exist two co-
ordinate charts $\phi_\alpha$ and $\phi_\beta$ such that $\det(d(\phi_\beta \circ \phi_\alpha^{-1})) < 0$. From the proof of
Lemma 5.7.8, integrating a form over the intersection of these two coordinate charts,
with respect to one chart versus the other, will give a difference of signs. Then Def-
inition 5.7.9 is not well-defined. Consequently, it is impossible to define integration
of $n$-forms in this way over a non-orientable manifold. On the other hand, if $M$ is
non-orientable and $U$ is an open subset of $M$, it may be possible that $U$ is orientable.
In this case, Definition 5.7.9 applies.


Definition 5.7.10. Let $M^n$ be an oriented, smooth manifold, and let $\omega$ be an
$n$-form that vanishes outside a compact set. Let $\{\psi_i\}_{i \in I}$ be a partition of unity
subordinate to the atlas on $M$. Define

\[
\int_M \omega = \sum_{i \in I} \int_M \psi_i\, \omega,
\]

where we calculate each summand on the right using Definition 5.7.9.


The summation only involves a finite number of nonzero terms since w vanishes 
outside a compact set. The reader may wonder why we only consider forms that 
vanish outside of a compact subset of the manifold. This is similar to restricting 
one’s attention to definite integrals in standard calculus courses. Otherwise, we face 
improper integrals and must discuss limits. As it is, many manifolds we consider 
are themselves compact; in the context of compact manifolds, the requirement that 
w vanish outside a compact subset is superfluous. 

The next proposition outlines some properties of integration of $n$-forms on $n$-
dimensional manifolds that easily follow from properties of integration of functions
in $\mathbb{R}^n$ as seen in ordinary calculus. However, we first give a lemma that restates the
change-of-variables rule in integration over $\mathbb{R}^n$.


Lemma 5.7.11. Let $A$ and $B$ be compact subsets of $\mathbb{R}^n$. Let $f : A \to B$ be a smooth
map whose restriction to the interior $A^\circ$ is a diffeomorphism with the interior $B^\circ$.
Then on $A^\circ$, $f$ is either orientation-preserving or orientation-reversing on each
connected component. Furthermore, for any $n$-form $\omega$ on $B$,

\[
\int_B \omega = \pm \int_A f^*\omega,
\]

where the sign is $+1$ (respectively, $-1$) if $f$ is orientation-preserving (respectively,
orientation-reversing) over $A$.


Proof. By the Inverse Function Theorem, $f^{-1}$ is differentiable at a point $f(p)$ if
and only if $df_p$ is invertible and if and only if $\det df_p \neq 0$. Since each component
function in the matrix of $df_p$ is continuous, then $\det df_p$ is a continuous function from
$A$ to $\mathbb{R}$. By the Intermediate Value Theorem, $\det df_p$ does not change signs over
any connected component of $A^\circ$. Thus, $f$ is orientation-preserving or orientation-
reversing on each connected component of $A^\circ$.

Let $(x^1, x^2, \ldots, x^n)$ be coordinates on $A \subset \mathbb{R}^n$ and $(y^1, y^2, \ldots, y^n)$ coordinates
on $B \subset \mathbb{R}^n$. Then we can write $\omega = a\, dy^1 \wedge \cdots \wedge dy^n$ for a smooth function
$a : \mathbb{R}^n \to \mathbb{R}$. By Problem 5.5.5,

\[
f^*\omega = (a \circ f)(\det df)\, dx^1 \wedge \cdots \wedge dx^n.
\]

Furthermore, according to the change-of-variables formula for integration in $\mathbb{R}^n$ (see
[55, Section 16.9, Equations (9) and (13)]) in the usual calculus notation, we have

\[
\int_B a\, dy^1\, dy^2 \cdots dy^n = \int_A (a \circ f)\, |\det df|\, dx^1\, dx^2 \cdots dx^n.
\]
Therefore, if $f$ is orientation-preserving on $A$,
\[
\begin{aligned}
\int_B \omega = \int_B a\, dy^1\, dy^2 \cdots dy^n &= \int_A (a \circ f)\, |\det df|\, dx^1\, dx^2 \cdots dx^n \\
&= \int_A (a \circ f)(\det df)\, dx^1\, dx^2 \cdots dx^n = \int_A f^*\omega.
\end{aligned}
\]

If $f$ is orientation-reversing, the above reasoning simply changes by $|\det df| =
-\det df$ and a $-1$ factors out of the integral.
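
A numerical experiment can illustrate both signs in Lemma 5.7.11. The sketch below is an
illustration under our own choices, none of which come from the text: $A = B = [0,1]^2$, the
coefficient function $a(y^1, y^2) = y^1 + y^2$, the orientation-preserving map
$f(s,t) = (s^2, t^3)$ with $\det df = 6st^2$, and the orientation-reversing map
$f(s,t) = (t^3, s^2)$ with $\det df = -6st^2$. Integrals are approximated by a midpoint rule.

```python
def midpoint_2d(func, n: int = 200) -> float:
    """Midpoint-rule approximation of the integral of func over [0, 1]^2."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        for j in range(n):
            t = (j + 0.5) * h
            total += func(s, t)
    return total * h * h


def a(y1: float, y2: float) -> float:
    """Coefficient of omega = a dy1 ^ dy2 on B (an illustrative choice)."""
    return y1 + y2


def pullback_preserving(s: float, t: float) -> float:
    """Coefficient of f^* omega for f(s, t) = (s^2, t^3), det df = 6 s t^2 > 0."""
    return a(s ** 2, t ** 3) * (6.0 * s * t ** 2)


def pullback_reversing(s: float, t: float) -> float:
    """Coefficient of f^* omega for f(s, t) = (t^3, s^2), det df = -6 s t^2 < 0."""
    return a(t ** 3, s ** 2) * (-6.0 * s * t ** 2)


if __name__ == "__main__":
    print("integral of omega over B          :", midpoint_2d(a))
    print("integral of f^* omega (preserving):", midpoint_2d(pullback_preserving))
    print("integral of f^* omega (reversing) :", midpoint_2d(pullback_reversing))
```

The first two values agree (the exact value is 1), while the third is their negative, matching
the sign rule in the lemma.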


Proposition 5.7.12 (Properties of Integration). Let $M$ and $N$ be oriented, smooth
manifolds with or without boundaries. Let $\omega$ and $\eta$ be smooth forms that vanish
outside of a compact set on $M$.

1. Linearity: For all $a, b \in \mathbb{R}$,
\[
\int_M (a\omega + b\eta) = a \int_M \omega + b \int_M \eta.
\]

2. Orientation change: If we denote by $(-M)$ the manifold $M$ but with the op-
posite orientation, then
\[
\int_{(-M)} \omega = -\int_M \omega.
\]

3. Substitution rule: If $g : N \to M$ is an orientation-preserving diffeomorphism,
\[
\int_M \omega = \int_N g^*\omega.
\]

Proof. Part 1 is left as an exercise for the reader.

If $M$ is an oriented manifold with atlas $\{(U_\alpha, \phi_\alpha)\}_{\alpha \in I}$, then equipping $M$ with an
opposite orientation means giving a different atlas $\{(V_\beta, \tilde{\phi}_\beta)\}_{\beta \in J}$ such that $\det d(\tilde{\phi}_\beta \circ
\phi_\alpha^{-1}) < 0$ whenever $\tilde{\phi}_\beta \circ \phi_\alpha^{-1}$ is defined. Following the proof of Lemma 5.7.8, one
can show from the reversal in orientation that

\[
\int_{(-K)} \omega = -\int_K \omega
\]

for any compact set $K$ in any intersection $U_\alpha \cap V_\beta$. Hence, by using appropriate
partitions of unity and piecing together the integral according to Definition 5.7.10,
we deduce part 2 of the proposition.

To prove part 3, assume again that $\omega$ is compactly supported in just one co-
ordinate chart $(U, \phi)$ of $M$. Otherwise, using a partition of unity, we can write
$\omega$ as a finite sum of $n$-forms, each compactly supported in just one coordinate
neighborhood. Without loss of generality, suppose that $g^{-1}(U)$ is a subset of a
coordinate chart $(V, \psi)$ on $N$. Saying that $g$ is orientation-preserving means that
$\det d(\phi \circ g \circ \psi^{-1}) > 0$. Since $g^{-1}(U) \subset V$, then $V$ contains the support of $g^*\omega$. Now,
by applying Lemma 5.7.11 to the diffeomorphism $\phi \circ g \circ \psi^{-1}$, we have

\[
\int_M \omega = \int_{\phi(U)} (\phi^{-1})^*\omega
= \int_{\psi(g^{-1}(U))} (\phi \circ g \circ \psi^{-1})^* (\phi^{-1})^*\omega
= \int_{\psi(g^{-1}(U))} (\psi^{-1})^* g^*\omega
= \int_N g^*\omega.
\]

In calculus we defined line integrals along piecewise-smooth curves or surface 
integrals on piecewise-smooth surfaces. Though we have not, to this point, defined 
piecewise-smooth manifolds, we can do so in a way that allows us to give a definition 
of the integral over a piecewise-smooth manifold. 


Definition 5.7.13. A piecewise-smooth manifold $M$ is a topological manifold that
is the finite union of smooth manifolds $M_1, M_2, \ldots, M_k$ that intersect only on their
boundaries. A piecewise-smooth manifold is oriented if each manifold $M_i$ is oriented
in such a way that if $M_i$ and $M_j$ intersect along a boundary component $C$, then the
orientation induced on $C$ from $M_i$ is opposite the orientation induced from $M_j$.


Definition 5.7.14. Let $M^n$ be a piecewise-smooth manifold as in Definition 5.7.13.
Let $\omega$ be an $n$-form that is smooth on each piece $M_i$. Then we define the integral

\[
\int_M \omega = \int_{M_1} \omega + \cdots + \int_{M_k} \omega.
\]


PROBLEMS 


5.7.1. Prove that a function $f : \mathbb{R} \to \mathbb{R}$ that is identically 0 for $x \leq 0$ and positive for
$x > 0$ cannot be analytic.

5.7.2. The manifold $\mathbb{RP}^3$ is orientable. Let $U_i = \{(x^1 : x^2 : x^3 : x^4) \in \mathbb{RP}^3 \mid x^i \neq 0\}$ be
the coordinate open set as described in Example 3.1.6. Use $\mathcal{A} = \{(U_i, \phi_i)\}_{i=1}^4$ as
the atlas for $\mathbb{RP}^3$. Define

\[
f(x : y : z : w) = \begin{cases}
e^{-w^2/(4w^2 - x^2 - y^2 - z^2)}, & \text{if } x^2 + y^2 + z^2 < 4w^2, \\
0, & \text{otherwise.}
\end{cases}
\]

Define also $h_1(x^1 : x^2 : x^3 : x^4) = f(x^2 : x^3 : x^4 : x^1)$, $h_2(x^1 : x^2 : x^3 : x^4) = f(x^3 :
x^4 : x^1 : x^2)$ and similarly for $h_3$ and $h_4$. Finally define

\[
\psi_i(x^1 : x^2 : x^3 : x^4) = \frac{h_i(x^1 : x^2 : x^3 : x^4)}{\sum_{j=1}^4 h_j(x^1 : x^2 : x^3 : x^4)}.
\]

(a) Prove that $f$ (and hence $\psi_i$ for $i = 1, 2, 3, 4$) is a well-defined function on
$\mathbb{RP}^3$.

(b) Prove that $f$ is smooth.

(c) Prove that $\{\psi_1, \psi_2, \psi_3, \psi_4\}$ is a smooth partition of unity of $\mathbb{RP}^3$ subordinate
to $\mathcal{A}$.


5.7.3. Prove that Proposition 5.7.12 holds for oriented piecewise-smooth manifolds.


5.8 Integration on Manifolds - Applications 


The reader might have noticed the impracticality of integrating $n$-forms on an
$n$-dimensional manifold directly from the definition. By virtue of the structure of a
manifold, simply to provide a consistent definition, we are compelled to use a formula
similar to that presented in Definition 5.7.10. On the other hand, integrals involving
terms such as $e^{-1/x}$ or bump functions as described in Example 5.7.5 are intractable
to compute by hand.

The following useful proposition gives a method to calculate integrals of forms
on a manifold using parametrizations while avoiding the use of an explicit partition
of unity. The proposition breaks the calculation into integrals over compact subsets
of $\mathbb{R}^n$, but we need to first comment on what types of compact sets we can allow.
We will consider compact sets $C \subset \mathbb{R}^n$ whose boundary $\partial C$ has “measure 0.” By
“measure 0,” we mean $\int_{\partial C} 1\, dV = 0$. More intuitively, we do not want $C$ to be
strange enough that its boundary $\partial C$ has any $n$-volume.


Proposition 5.8.1. Let $M^n$ be a smooth, oriented manifold with or without bound-
ary. Suppose that there exists a finite collection $\{C_i\}_{i=1}^k$ of compact subsets of $\mathbb{R}^n$,
each with boundary $\partial C_i$ of measure 0, along with a collection of smooth functions
$F_i : C_i \to M$ such that: (1) each $F_i$ is a diffeomorphism from the interior $C_i^\circ$
onto the interior $F_i(C_i)^\circ$ and (2) any pair $F_i(C_i)$ and $F_j(C_j)$ intersect only along
their boundary. Then for any $n$-form $\omega$ on $M$, which has a compact support that is
contained in $F_1(C_1) \cup \cdots \cup F_k(C_k)$,

\[
\int_M \omega = \sum_{i=1}^k \int_{C_i} F_i^*\omega. \tag{5.35}
\]

Proof. We need the following remarks from set theory and topology. Recall that
for any function $f : X \to Y$ and any subsets $A, B$ of $Y$, we have $f^{-1}(A \cup B) =
f^{-1}(A) \cup f^{-1}(B)$ and $f^{-1}(A \cap B) = f^{-1}(A) \cap f^{-1}(B)$. For general functions, the
second equality can fail when one replaces $f^{-1}$ with $f$ and takes $A, B \subset X$, since only
$f(A \cap B) \subset f(A) \cap f(B)$ holds in general. However, if $f$ is bijective,
both equalities hold for $f$ as well.

Let $\mathcal{A} = \{(U_\alpha, \phi_\alpha)\}_{\alpha \in I}$ be the atlas given on $M$. Let $K$ be the support of $\omega$.
Note that since each $F_i$ is continuous, then $F_i(C_i)$ is compact.

Suppose first that $K$ is a subset of a single coordinate chart $(U, \phi)$. Since
$K \subset F_1(C_1) \cup \cdots \cup F_k(C_k)$,

\[
K = K \cap \big(F_1(C_1) \cup \cdots \cup F_k(C_k)\big) = (F_1(C_1) \cap K) \cup \cdots \cup (F_k(C_k) \cap K),
\]
and, again, because $\phi$ is a bijection,
\[
\phi(K) = \big(\phi \circ F_1(C_1) \cap \phi(K)\big) \cup \cdots \cup \big(\phi \circ F_k(C_k) \cap \phi(K)\big). \tag{5.36}
\]

Since $\phi$ is a homeomorphism and since any pair $F_i(C_i)$ and $F_j(C_j)$ intersect only
along their boundaries, then the same holds for any pair $K \cap F_i(C_i)$ and $K \cap F_j(C_j)$
and also for any pair $\phi(K) \cap (\phi \circ F_i)(C_i)$ and $\phi(K) \cap (\phi \circ F_j)(C_j)$.

By definition of integration of $n$-forms over a coordinate chart, i.e., Definition
5.7.9,
\[
\int_M \omega = \int_{\phi(U)} (\phi^{-1})^*\omega = \int_{\phi(K)} (\phi^{-1})^*\omega.
\]

By Equation (5.36) and the theorem on subdividing an integral by nonoverlapping
regions in $\mathbb{R}^n$ (see [55, Section 16.3, Equation (9)] for the statement for integrals
over $\mathbb{R}^2$),

\[
\int_M \omega = \sum_{i=1}^k \int_{\phi(K) \cap (\phi \circ F_i)(C_i)} (\phi^{-1})^*\omega.
\]

Note that this is precisely where we need to require that the $C_i$ have boundaries of
measure 0.

The setup for the proposition was specifically designed to apply Lemma 5.7.11
to the function $\phi \circ F_i : C_i \to (\phi \circ F_i)(C_i)$ for each $i \in \{1, \ldots, k\}$. We have

\[
\begin{aligned}
\int_{\phi(K) \cap (\phi \circ F_i)(C_i)} (\phi^{-1})^*\omega
&= \int_{F_i^{-1}(K) \cap C_i} (\phi \circ F_i)^* (\phi^{-1})^*\omega \\
&= \int_{F_i^{-1}(K) \cap C_i} F_i^*\omega = \int_{C_i} F_i^*\omega,
\end{aligned}
\]

and the proposition follows for when $K$ is a subset of a single coordinate chart.

If $K$ is not a subset of a single coordinate chart, we use a partition of unity $\{\psi_j\}$
subordinate to the atlas of $M$. In this case, the proposition again follows, using
Proposition 5.5.6(2), so that for each partition-of-unity function $\psi_j$, we have

\[
F_i^*(\psi_j\, \omega) = (\psi_j \circ F_i)\, F_i^*\omega.
\]


With this proposition at our disposal, we are finally in a position to present
some examples of integration of $n$-forms on a smooth $n$-manifold.


Example 5.8.2. Consider the 2-torus $M = T^2 = S^1 \times S^1$. We choose an atlas on
$M$ so that one of the coordinate functions is $\phi : M \to (0, 2\pi)^2$ corresponding to
pairs of angles going around each $S^1$. It should be clear by now that in order to
express an $n$-form explicitly, we need a coordinate-dependent description. Let $\omega$ be a
2-form on $M$ such that

\[
(\phi^{-1})^*(\omega) = (3 + \cos v)^2 \cos v\, du \wedge dv.
\]

Then from Proposition 5.8.1 we calculate that

\[
\begin{aligned}
\int_M \omega &= \int_0^{2\pi} \int_0^{2\pi} (3 + \cos v)^2 \cos v\, du\, dv \\
&= 2\pi \int_0^{2\pi} \big(9\cos v + 6\cos^2 v + \cos^3 v\big)\, dv = 12\pi^2.
\end{aligned}
\]
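
The value $12\pi^2$ is easy to confirm numerically. The following sketch assumes only the
integrand of Example 5.8.2; the function name is ours. Because the integrand does not depend
on $u$, the $u$-integral contributes a factor of $2\pi$, and the remaining integral in $v$ is
approximated with the midpoint rule, which is very accurate for periodic integrands.

```python
import math


def torus_integral(n: int = 200) -> float:
    """Approximate the double integral of (3 + cos v)^2 cos v over [0, 2 pi]^2."""
    h = 2.0 * math.pi / n
    total = 0.0
    for j in range(n):
        v = (j + 0.5) * h
        total += (3.0 + math.cos(v)) ** 2 * math.cos(v)
    # The integrand is independent of u, so the u-integral is a factor of 2 pi.
    return 2.0 * math.pi * total * h


if __name__ == "__main__":
    print("numerical value:", torus_integral())
    print("12 pi^2        :", 12.0 * math.pi ** 2)
```

Only the $6\cos^2 v$ term contributes, since $\cos v$ and $\cos^3 v$ integrate to 0 over a full period.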
0 


This example illustrates a special case of a particular situation. We often think
of the torus as an embedded submanifold of $\mathbb{R}^3$. This motivates the following
definition.


Definition 5.8.3 (Integration on Submanifolds). If $M$ is an immersed submanifold
of dimension $m$ with the immersion $f : M^m \to N^n$ and if $\omega \in \Omega^m(N)$, then we
define
\[
\int_{f(M)} \omega = \int_M f^*\omega. \tag{5.37}
\]

This definition applies in particular to embedded submanifolds.


Example 5.8.4. We revisit Example 5.8.2 to show how it relates to Definition 5.8.3.
Suppose that we embed the torus in $\mathbb{R}^3$ using the parametrization

\[
F(u,v) = \big((3 + \cos v)\cos u,\ (3 + \cos v)\sin u,\ \sin v\big) \qquad \text{for } (u,v) \in [0, 2\pi]^2.
\]

Notice that this parametrization is described by $F = f \circ \phi^{-1}$, where $\phi$ is the coor-
dinate chart described in Example 5.8.2 and $f : M \to \mathbb{R}^3$ is the actual embedding
function of the torus into $\mathbb{R}^3$.

Consider the 2-form on $\mathbb{R}^3$ defined by $\eta = -y\, dx \wedge dz + x\, dy \wedge dz$. We calculate
that

\[
\begin{aligned}
F^*(dx \wedge dz) &= d\big((3 + \cos v)\cos u\big) \wedge d(\sin v) \\
&= -\sin u \cos v\, (3 + \cos v)\, du \wedge dv, \\
F^*(dy \wedge dz) &= d\big((3 + \cos v)\sin u\big) \wedge d(\sin v) \\
&= \cos u \cos v\, (3 + \cos v)\, du \wedge dv.
\end{aligned}
\]

Thus,

\[
\begin{aligned}
F^*\eta &= -(3 + \cos v)\sin u\, F^*(dx \wedge dz) + (3 + \cos v)\cos u\, F^*(dy \wedge dz) \\
&= (3 + \cos v)^2 \cos v\, du \wedge dv.
\end{aligned}
\]
So $F^*\eta = (\phi^{-1})^*(\omega)$ from the previous example. Since $F^* = (\phi^{-1})^* \circ f^*$, we see
that the form $\omega$ chosen in Example 5.8.2 is $f^*\eta$. So by Definition 5.8.3, we connect
these integrations by

\[
\begin{aligned}
\int_{f(M)} \eta = \int_M f^*\eta = \int_M \omega &= \int_{[0,2\pi]^2} F^*\eta \\
&= \int_{[0,2\pi]^2} (3 + \cos v)^2 \cos v\, du\, dv = 12\pi^2,
\end{aligned}
\]

which we calculated in Example 5.8.2.


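Example 5.8.5. As another example, consider the unit sphere $S^2$ in $\mathbb{R}^3$ covered
by the six coordinate patches described in Example 3.1.5. Adjusting notation to
$F_1 = \vec{X}_{(1)}$ and $F_2 = \vec{X}_{(2)}$, we observe that if we use the compact set $C_1 = C_2$ as
the closed unit disk $\{(u,v) \mid u^2 + v^2 \leq 1\}$, then the sphere can be covered by $F_1(C_1)$
and $F_2(C_2)$. Thus, we have $k = 2$ in the setup of Proposition 5.8.1.

Consider the 2-form $\omega = xz^3\, dy \wedge dz$ on $S^2$, with the $x, y, z$ representing the
coordinates in $\mathbb{R}^3$. We have

\[
F_1(u,v) = \big(u, v, \sqrt{1 - u^2 - v^2}\big) \quad \text{and} \quad F_2(u,v) = \big(u, v, -\sqrt{1 - u^2 - v^2}\big),
\]

so we calculate that

\[
\begin{aligned}
F_1^*\omega &= u(1 - u^2 - v^2)^{3/2}\, dv \wedge \left( -\frac{u}{\sqrt{1 - u^2 - v^2}}\, du - \frac{v}{\sqrt{1 - u^2 - v^2}}\, dv \right) \\
&= u^2 (1 - u^2 - v^2)\, du \wedge dv,
\end{aligned}
\]

and similarly, $F_2^*\omega = u^2(1 - u^2 - v^2)\, du \wedge dv$. Then by Proposition 5.8.1,

\[
\int_{S^2} \omega = \int_{C_1} F_1^*\omega + \int_{C_2} F_2^*\omega = 2 \int_{C_1} u^2 (1 - u^2 - v^2)\, du\, dv.
\]

Putting this in polar coordinates, we get

\[
\begin{aligned}
\int_{S^2} \omega &= 2 \int_0^{2\pi} \int_0^1 r^2 \cos^2\theta\, (1 - r^2)\, r\, dr\, d\theta \\
&= 2 \left( \int_0^{2\pi} \cos^2\theta\, d\theta \right) \left( \int_0^1 (r^3 - r^5)\, dr \right) = \frac{\pi}{6}.
\end{aligned}
\]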

An important case of integration on submanifolds is the line integral over a
curve.

Definition 5.8.6. Let $\gamma : [a,b] \to M$ be a smooth curve, and let $\omega$ be a 1-form on
$M$. We define the line integral of $\omega$ over $\gamma$ as

\[
\int_\gamma \omega = \int_{[a,b]} \gamma^*\omega.
\]

In addition, if $\gamma$ is a piecewise-smooth curve, we define

\[
\int_\gamma \omega = \sum_{i=1}^k \int_{[c_{i-1}, c_i]} \gamma^*\omega,
\]

where $[c_{i-1}, c_i]$, with $i = 1, \ldots, k$, are the smooth arcs of $\gamma$.


At the beginning of this section, we proposed to find a definition of integration 
that generalizes many common notions from standard calculus. We explain now 
how the above two definitions generalize the concepts of line integrals in R” and 
integrals of vector fields over surfaces. 

Consider first the situation of line integrals in R?. (The case for R” is identical 
in form.) In vector calculus (see [55, Definition 17.2.13]), one considers a continuous 
vector field F : R? — R® defined over a smooth curve ¥ : [a,b] + R°. Then the line 


integral is defined as 
b 
[F dF = i F(F(t)) - 7 (t) dt. 
xy a 


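To connect the classical line integral to the line integral in our present formulation,
set $\omega = F_1\, dx + F_2\, dy + F_3\, dz$, where $\vec{F} = (F_1, F_2, F_3)$. If we write $\gamma(t) = \vec{r}(t) =
(\gamma^1(t), \gamma^2(t), \gamma^3(t))$, then

$$\begin{aligned}
\gamma^*\omega &= F_1(\gamma(t))\, d(\gamma^1) + F_2(\gamma(t))\, d(\gamma^2) + F_3(\gamma(t))\, d(\gamma^3) \\
&= \left( F_1(\gamma(t)) (\gamma^1)'(t) + F_2(\gamma(t)) (\gamma^2)'(t) + F_3(\gamma(t)) (\gamma^3)'(t) \right) dt \\
&= \vec{F}(\vec{r}(t)) \cdot \vec{r}\,'(t)\, dt.
\end{aligned}$$

Thus, we have shown that the classical and modern line integrals are equal via

$$\int_\gamma \omega = \int_a^b \gamma^*\omega = \int_\gamma \vec{F} \cdot d\vec{r}.$$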
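
The identification above translates directly into a computation: to integrate a 1-form along a curve, evaluate its coefficients along the curve, pair them with the velocity, and integrate in $t$. The sketch below does this numerically for a 1-form on $\mathbb{R}^3$. The helper `line_integral`, the sample form $\omega = y\,dx + z\,dy + x\,dz$, and the helix are illustrative assumptions; the velocity is estimated by a central difference rather than computed symbolically.

```python
import math

def line_integral(omega, gamma, a, b, n=20000, h=1e-6):
    """Approximate the integral of gamma^* omega over [a, b].

    omega(p) returns the coefficients (F1, F2, F3) of F1 dx + F2 dy + F3 dz at p;
    gamma(t) returns a point of R^3. gamma'(t) is estimated by a central difference,
    and the t-integral by the midpoint rule."""
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        p_plus, p_minus = gamma(t + h), gamma(t - h)
        velocity = [(q1 - q0) / (2.0 * h) for q1, q0 in zip(p_plus, p_minus)]
        coefficients = omega(gamma(t))
        total += sum(c * v for c, v in zip(coefficients, velocity)) * dt
    return total

def omega(p):
    x, y, z = p
    return (y, z, x)          # omega = y dx + z dy + x dz

def helix(t):
    return (math.cos(t), math.sin(t), t)

value = line_integral(omega, helix, 0.0, 2.0 * math.pi)
print(value)
```

For this choice the integrand is $-\sin^2 t + t\cos t + \cos t$, whose integral over $[0, 2\pi]$ is $-\pi$, so the printed value should be close to $-\pi$.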


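Second, consider the situation for surface integrals. In vector calculus (see [55,
Definitions 17.7.8 and 17.7.9]), one considers a continuous vector field $\vec{F} : \mathbb{R}^3 \to \mathbb{R}^3$
defined over an oriented surface $S$ parametrized by $\vec{r} : D \to \mathbb{R}^3$, where $D$ is a
compact region in $\mathbb{R}^2$. If $(u,v)$ are the variables used in $D$, then

$$\iint_S \vec{F} \cdot d\vec{S} = \iint_D \vec{F}(\vec{r}(u,v)) \cdot (\vec{r}_u \times \vec{r}_v)\, dA.$$

To demonstrate the connection with the modern formulation, if we write $\vec{F} =
(F_1, F_2, F_3)$, then set

$$\omega = F_1 \eta^1 + F_2 \eta^2 + F_3 \eta^3,$$

where the $\eta^j$ are the 2-forms described in Equation (5.24). Set also $f(u,v) = \vec{r}(u,v)$,
and write $f = (f^1, f^2, f^3)$ as component functions in $\mathbb{R}^3$. Then

$$f^*\omega = F_1(f(u,v))\, f^*\eta^1 + F_2(f(u,v))\, f^*\eta^2 + F_3(f(u,v))\, f^*\eta^3. \tag{5.38}$$

We calculate $f^*\eta^1$ as

$$\begin{aligned}
f^*(dx^2 \wedge dx^3) = df^2 \wedge df^3 &= \left( \frac{\partial f^2}{\partial u}\, du + \frac{\partial f^2}{\partial v}\, dv \right) \wedge \left( \frac{\partial f^3}{\partial u}\, du + \frac{\partial f^3}{\partial v}\, dv \right) \\
&= \left( \frac{\partial f^2}{\partial u} \frac{\partial f^3}{\partial v} - \frac{\partial f^2}{\partial v} \frac{\partial f^3}{\partial u} \right) du \wedge dv.
\end{aligned}$$

Repeating similar calculations for $f^*\eta^2$ and $f^*\eta^3$ and putting the results in Equation
(5.38), we arrive at

$$f^*\omega = \vec{F}(\vec{r}(u,v)) \cdot (\vec{r}_u \times \vec{r}_v)\, du \wedge dv.$$

Using Definition 5.8.3 for the integration on a submanifold, we conclude that

$$\int_S \omega = \int_D f^*\omega = \iint_S \vec{F} \cdot d\vec{S},$$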


thereby showing how integration of 2-forms on a submanifold gives the classical 
surface integral. 
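
The pulled-back coefficient $\vec{F}(\vec{r}(u,v)) \cdot (\vec{r}_u \times \vec{r}_v)$ is easy to evaluate numerically. The sketch below, a minimal illustration under stated assumptions, integrates it over a parameter rectangle with a midpoint rule. The helper names, the radial field $\vec{F} = (x, y, z)$, and the colatitude-longitude parametrization of the upper unit hemisphere are all illustrative choices; with $u$ the colatitude and $v$ the longitude, $\vec{r}_u \times \vec{r}_v$ points outward.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def flux_integral(F, f, u_range, v_range, n=200, h=1e-6):
    """Midpoint-rule approximation of the integral over the parameter rectangle
    of F(f(u, v)) . (f_u x f_v) du dv, with partial derivatives of the
    parametrization f estimated by central differences."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            f_u = [(p - m) / (2.0 * h) for p, m in zip(f(u + h, v), f(u - h, v))]
            f_v = [(p - m) / (2.0 * h) for p, m in zip(f(u, v + h), f(u, v - h))]
            normal = cross(f_u, f_v)
            total += sum(a * b for a, b in zip(F(f(u, v)), normal)) * du * dv
    return total

def radial_field(p):
    return p                      # F(x, y, z) = (x, y, z)

def hemisphere(u, v):
    # u = colatitude in [0, pi/2], v = longitude in [0, 2 pi]
    return (math.sin(u) * math.cos(v), math.sin(u) * math.sin(v), math.cos(u))

flux = flux_integral(radial_field, hemisphere,
                     (0.0, math.pi / 2.0), (0.0, 2.0 * math.pi))
print(flux, 2.0 * math.pi)
```

Here $\vec{r}_u \times \vec{r}_v = \sin u \, \vec{r}$, so the integrand is $\sin u$ and the exact value is $2\pi$, the area of the hemisphere.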

It is interesting to observe how the integration of forms on manifolds and on 
submanifolds of a manifold generalizes simultaneously many of the integrals that 
are studied in classic calculus, which are in turn studied for their applicability to 
science. However, the reader who has been checking off the list at the beginning of 
this section of types of integration we proposed to generalize might notice that until 
now we have not provided generalizations for path integrals $\int_C f\, ds$ or integrals of
scalar functions over a surface $\iint_S f\, dA$. The reason for this is that these integrals
involve an arclength element ds or a surface area element dA. However, given a 
smooth manifold M without any additional structure, there is no way to discuss 
distances, areas, or n-volumes on M. Riemannian manifolds, which we introduce in 
the next chapter, provide a structure that allows us to make geometric calculations 
of length and volume. In that context, one can easily define generalizations of path 
integrals and integrals of scalar functions over a surface. 

Before moving on to applications to physics, we mention a special case where 
the line integral is easy to compute. 


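Theorem 5.8.7 (Fundamental Theorem for Line Integrals). Let $M$ be a smooth
manifold, let $f : M \to \mathbb{R}$ be a smooth function, and let $\gamma : [a,b] \to M$ be a
piecewise-smooth curve on $M$. Then

$$\int_\gamma df = f(\gamma(b)) - f(\gamma(a)).$$

Proof. By Proposition 5.5.6(3), $\gamma^*(df) = d(\gamma^* f)$, so on each smooth arc of $\gamma$ we have

$$\gamma^*(df) = d(\gamma^* f) = d(f \circ \gamma) = (f \circ \gamma)'(t)\, dt,$$

where $t$ is the variable on the manifold $[a,b]$. Thus, if $\gamma$ is smooth,

$$\int_\gamma df = \int_{[a,b]} \gamma^*(df) = \int_a^b (f \circ \gamma)'(t)\, dt = f(\gamma(b)) - f(\gamma(a)).$$

If $\gamma$ is only piecewise smooth, applying this identity on each smooth arc
$[c_{i-1}, c_i]$ and summing gives a telescoping sum with the same value.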
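
The theorem can be checked numerically on a piecewise-smooth curve. The sketch below integrates $df$ arc by arc along an L-shaped path in $\mathbb{R}^2$ and compares the total with the difference of the endpoint values. The function $f(x,y) = x^2 y + x$, the path, and the helper `integrate_df_along` are illustrative assumptions.

```python
def integrate_df_along(arc, grad_f, a, b, n=2000, h=1e-6):
    """Midpoint-rule approximation of the integral of arc^*(df) over [a, b].
    grad_f(p) returns the coefficients of df at p; the velocity of the arc
    is estimated by a central difference."""
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        velocity = [(q1 - q0) / (2.0 * h) for q1, q0 in zip(arc(t + h), arc(t - h))]
        total += sum(g * v for g, v in zip(grad_f(arc(t)), velocity)) * dt
    return total

def f(p):
    x, y = p
    return x**2 * y + x

def grad_f(p):
    x, y = p
    return (2.0 * x * y + 1.0, x**2)   # df = (2xy + 1) dx + x^2 dy

# Two smooth arcs: (0,0) -> (1,0) on [0,1], then (1,0) -> (1,1) on [1,2].
arcs = [
    (lambda t: (t, 0.0), 0.0, 1.0),
    (lambda t: (1.0, t - 1.0), 1.0, 2.0),
]

lhs = sum(integrate_df_along(arc, grad_f, a, b) for arc, a, b in arcs)
rhs = f((1.0, 1.0)) - f((0.0, 0.0))
print(lhs, rhs)
```

Each arc contributes 1, so both sides equal 2; the value of $f$ at the corner $(1,0)$ cancels in the sum, which is the telescoping behind the piecewise-smooth case.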


5.8.1 Conservative Vector Fields 


We now wish to look at a central topic from elementary physics through the lens of 
our theory of integration on a manifold. 

In elementary physics, one of the first areas studied is the dynamics of a particle 
under the action of a force. We remind the reader of some basic facts from physics. 
Suppose a particle of constant mass $m$ is acted upon by a force $\vec{F}$ (which may
depend on time and space) and follows a trajectory parametrized by $\vec{r}(t)$. Writing
$\vec{v} = \vec{r}\,'$ for the velocity and $v = \|\vec{v}\|$ for the speed of the particle, we define the kinetic
energy by $T = \frac{1}{2} m v^2$. Furthermore, since $m$ is constant for a particle, according to
Newton’s law of motion, $\vec{F} = m \vec{r}\,''$. Finally, as the particle travels for $t_1 \le t \le t_2$,
we define the work done by $\vec{F}$ as the line integral

$$W = \int_{t_1}^{t_2} \vec{F} \cdot d\vec{r}.$$

The kinetic energy depends on time and we have

$$\frac{dT}{dt} = \frac{d}{dt}\left( \frac{1}{2} m\, \vec{v} \cdot \vec{v} \right) = m\, \frac{d\vec{v}}{dt} \cdot \vec{v} = \vec{F} \cdot \vec{v}. \tag{5.39}$$

Thus, as a particle moves along $\vec{r}(t)$ for $t_1 \le t \le t_2$, the change in kinetic energy is

$$T_2 - T_1 = T(t_2) - T(t_1) = \int_{t_1}^{t_2} \vec{F} \cdot \vec{v}\, dt = \int_\gamma \vec{F} \cdot d\vec{r} = W, \tag{5.40}$$



where the last integral is a line integral over $\gamma$, the curve traced out by the trajectory
$\vec{r}(t)$ of the particle. Thus, the change in kinetic energy is equal to the work done
by the external forces. This result is often called the Energy Theorem. A force is
called conservative if it does not depend on time and if, as a particle travels over
any closed, piecewise-smooth curve, the kinetic energy does not change.

Though in physics one simply speaks of vectors or vector fields, from the per- 
spective of manifold theory, certain objects may be vectors or covectors, depending 
on their use or their transformational properties under coordinate systems. Some 
objects viewed as vector fields in classical physics should even be understood as a
2-form; this possible confusion arises from the fact that over a smooth
3-manifold $M$, for each $p \in M$, the spaces $\bigwedge^1 T_pM^*$, $\bigwedge^2 T_pM^*$, and $T_pM$ are all
isomorphic as vector spaces.

Figure 5.10: Two paths between $p_1$ and $p_2$.

Because of how it appears in the Energy Theorem (5.40), a force field should be 
viewed as a 1-form $\omega$ defined in $\mathbb{R}^3$. Then the Energy Theorem for the trajectory
of a particle can be written as

$$\int_\gamma \omega = T(\gamma(t_2)) - T(\gamma(t_1)).$$

At any given point $p$ along the trajectory of the particle, the velocity of the particle
is a tangent vector $v \in T_p\mathbb{R}^3$. Then the instantaneous change of energy in (5.39) is
simply the contraction $\omega(v)_p$.


Definition 5.8.8. A 1-form (covector field) $\omega$ on a smooth manifold $M$ is called
conservative if

$$\int_\gamma \omega = 0$$

for all closed, piecewise-smooth curves $\gamma$ on $M$.


This definition has a different and perhaps more useful characterization. If $\gamma_1$
and $\gamma_2$ are two piecewise-smooth paths from a point $p_1$ to a point $p_2$, then the path
$\gamma_1 \bullet (-\gamma_2)$ defined by first traveling from $p_1$ to $p_2$ along $\gamma_1$, and then traveling
backwards from $p_2$ to $p_1$ along $\gamma_2$ is a closed, piecewise-smooth curve (see Figure 5.10).
It is not hard to show that for any 1-form $\omega$,

$$\int_{\gamma_1 \bullet (-\gamma_2)} \omega = \int_{\gamma_1} \omega + \int_{-\gamma_2} \omega = \int_{\gamma_1} \omega - \int_{\gamma_2} \omega.$$

Hence, a covector field $\omega$ is conservative if and only if the integral of $\omega$ between any
two points $p_1$ and $p_2$ is independent of the path between them.

Conservative smooth 1-forms have another characterization, whose proof we leave
as an exercise for the reader.

Theorem 5.8.9. Let $M$ be a smooth, oriented manifold. A 1-form $\omega \in \Omega^1(M)$ is
conservative if and only if $\omega$ is exact.
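
A small numerical experiment makes the distinction concrete. The sketch below integrates two 1-forms on $\mathbb{R}^2$ around the unit circle: the exact form $d(x^2 y) = 2xy\,dx + x^2\,dy$ and the form $-y\,dx + x\,dy$, which is not exact. Both forms, the curve, and the helper `loop_integral` are illustrative assumptions. A nonzero loop integral rules out conservativeness, whereas a single vanishing loop integral is only consistent with it.

```python
import math

def loop_integral(omega, n=10000):
    """Integrate the 1-form omega = P dx + Q dy around the unit circle,
    parametrized counterclockwise by gamma(t) = (cos t, sin t), 0 <= t <= 2 pi.
    omega(x, y) returns the pair (P, Q)."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t), math.cos(t)    # components of gamma'(t)
        P, Q = omega(x, y)
        total += (P * dx + Q * dy) * dt
    return total

def exact_form(x, y):
    return (2.0 * x * y, x**2)    # d(x^2 y)

def rotation_form(x, y):
    return (-y, x)                # -y dx + x dy, not exact on R^2

exact_value = loop_integral(exact_form)
rotation_value = loop_integral(rotation_form)
print(exact_value, rotation_value)
```

The first value vanishes up to rounding, while the second equals $2\pi$, so $-y\,dx + x\,dy$ is not conservative.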


Returning to physics in Euclidean $\mathbb{R}^3$, according to the Energy Theorem from
Equation (5.40), a force is conservative if and only if the work done over a
piecewise-smooth path between any two points $p_1$ and $p_2$ is independent of the path chosen.
Thus, if $\vec{F}$ is conservative, one defines the potential energy by

$$V(x,y,z) = -\int_{(x_0,y_0,z_0)}^{(x,y,z)} \vec{F} \cdot d\vec{r},$$

where $(x_0, y_0, z_0)$ is any fixed point. Obviously, the potential energy of $\vec{F}$ is a
function that is well-defined only up to a constant that corresponds to the selected
origin point $(x_0, y_0, z_0)$. It is easy to check that

$$\vec{F} = -\vec{\nabla} V.$$

For a conservative force $\vec{F}$ with potential energy $V$, the work of $\vec{F}$ as the particle
travels along $\vec{r}(t)$ for $t \in [t_1, t_2]$ is

$$W = \int_\gamma \vec{F} \cdot d\vec{r} = -\int_\gamma \vec{\nabla} V \cdot d\vec{r} = -\left( V(\vec{r}(t_2)) - V(\vec{r}(t_1)) \right) = -(V_2 - V_1).$$

Hence, the Energy Theorem can be rewritten as

$$T_1 + V_1 = T_2 + V_2.$$

The sum $T + V$ of kinetic and potential energy is often referred to simply as the
energy or total energy of a particle. This justifies the terminology “conservative”: 
the total energy of a particle moving under the action of a conservative force is 
conserved along any path. 
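
Conservation of $T + V$ can be observed in a simple simulation. The sketch below integrates Newton's law $\vec{F} = m\vec{r}\,''$ for the conservative force $\vec{F} = -\vec{\nabla} V$ with the hypothetical potential $V(x,y) = \frac{1}{2}(x^2 + 4y^2)$ in the plane, and records the total energy at every step. The potential, the initial conditions, the step size, and the velocity Verlet integrator are all assumptions chosen for illustration; the numerical energy is only approximately constant because the time stepping introduces a small discretization error.

```python
def potential(x, y):
    return 0.5 * (x**2 + 4.0 * y**2)

def force(x, y):
    # F = -grad V for the potential above
    return (-x, -4.0 * y)

def total_energies(steps=10000, dt=1e-3, m=1.0):
    """Velocity Verlet integration of m r'' = F(r); returns T + V at each step."""
    x, y = 1.0, 0.5
    vx, vy = 0.0, 1.0
    fx, fy = force(x, y)
    energies = []
    for _ in range(steps):
        energies.append(0.5 * m * (vx**2 + vy**2) + potential(x, y))
        vx += 0.5 * dt * fx / m      # half step in velocity
        vy += 0.5 * dt * fy / m
        x += dt * vx                 # full step in position
        y += dt * vy
        fx, fy = force(x, y)
        vx += 0.5 * dt * fx / m      # second half step in velocity
        vy += 0.5 * dt * fy / m
    return energies

energies = total_energies()
drift = max(energies) - min(energies)
print(energies[0], drift)
```

The initial energy is $T + V = 0.5 + 1.0 = 1.5$, and the spread `drift` over the whole trajectory stays several orders of magnitude smaller than that value.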

As further examples of applications of manifold theory to physics, Problems 5.8.9 
through 5.8.11 discuss conservative properties and calculations of flux across sur- 
faces for inverse square forces. 


PROBLEMS 
5.8.1. Let y be the curve in R* parametrized by y(t) = (1 + ¢?, 2¢ — 1,¢3 — 4t 


5.8.2. Calculate the line integral $\int_\gamma \omega$, where $\gamma$ is the triangle in $\mathbb{R}^3$ with vertices $(0, 1, 2)$,
$(1, 2, 4)$ and $(-3, 4, -2)$ and where $\omega$ is the 1-form given in $\mathbb{R}^3$ by

$$\omega = (2xzy + 1)\, dx + 3x\, dy + yz\, dz.$$


5.8.3. Evaluate $\int_M \omega$, where $M$ is the portion of the paraboloid in $\mathbb{R}^3$ given by $z = 9 - x^2 - y^2$
above the $xy$-plane and where $\omega$ is the 2-form $\omega = y^2\, dx \wedge dy + z^2\, dx \wedge dz + 2\, dy \wedge dz$.

5.8.4. Let $T^2$ be the torus embedded in $\mathbb{R}^4$ that is given by the equations $x^2 + y^2 =
z^2 + w^2 = 1$. Note that the flat torus can be parametrized by

$$F(u,v) = (\cos u, \sin u, \cos v, \sin v)$$

for appropriate $u$ and $v$. Compute the integral $\int_{T^2} \omega$, where $\omega$ is the 2-form in $\mathbb{R}^4$
given by
(a) $\omega = x^3\, dy \wedge dw$;
(b) $\omega = x^3 z\, dy \wedge dw$;
(c) $\omega = (x^2 yz + 1)\, dx \wedge dz + e^x yz\, dy \wedge dw$.

5.8.5. Consider the unit sphere $M = S^2$ embedded in $\mathbb{R}^3$. Let

$$\omega = \frac{z^2\, dx \wedge dy + x\, dx \wedge dz + xy\, dy \wedge dz}{x^2 + y^2 + z^2}$$

be a 2-form pulled back to $S^2$. Calculate directly the integral $\int_M \omega$ using:

(a) the latitude-longitude parametrization;

(b) the stereographic parametrizations $\{\pi_N, \pi_S\}$ defined in Problem 3.2.5 and
Example 3.7.3. [Hint: Use two coordinate patches.]

5.8.6. Consider the 3-torus described in Problem 5.2.4. Calculate $\int_M \omega$, where

(a) $\omega = \left( \cos^2 u + \frac{1 + \sin w}{5 + \cos u} \right) du \wedge dv \wedge dw$ given in local coordinates;

(b) $\omega = x^1\, dx^1 \wedge dx^2 \wedge dx^3 + x^2\, dx^2 \wedge dx^3 \wedge dx^4$ in coordinates in $\mathbb{R}^4$.

5.8.7. Let $T^2 = S^1 \times S^1$ be the 2-torus where we use a pair of angles $(\theta, \varphi) \in (0, 2\pi)^2$
as one of the coordinate charts and complete it in the natural manner to cover
the whole torus. Show that $\omega$ expressed as $3\cos^2\theta \sin\varphi\, d\theta + (2 + \cos^2\varphi)\, d\varphi$ over
the given coordinate chart extends to a 1-form over the whole torus. Consider
the curve $C$ on $T^2$ given as a submanifold $\gamma : [0, 2\pi] \to T^2$ expressed over this
coordinate chart as $(\theta, \varphi) = \gamma(t) = (2t, 3t)$. Calculate the integral $\int_C \omega$.

5.8.8. Prove part 1 of Proposition 5.7.12.

5.8.9. The force exerted by an electric charge placed at the origin on a charged particle
is given by the force field $\vec{F}(\vec{r}) = K\vec{r}/\|\vec{r}\|^3$, where $K$ is a constant and $\vec{r} = (x, y, z)$
is the position vector of the charged particle. Write this force field as the covector

$$\omega = \frac{Kx}{(x^2 + y^2 + z^2)^{3/2}}\, dx + \frac{Ky}{(x^2 + y^2 + z^2)^{3/2}}\, dy + \frac{Kz}{(x^2 + y^2 + z^2)^{3/2}}\, dz$$

over the manifold $\mathbb{R}^3$.

(a) Calculate the work exerted by the force on a charged particle that travels
along the straight line from $(3, -1, 2)$ to $(4, 5, -1)$.

(b) Prove that $\vec{F}$ is a conservative force, i.e., that $\omega$ is a conservative covector
field.


(c) Prove that $\omega = df$ where $f(x,y,z) = -K/r = -K/(x^2 + y^2 + z^2)^{1/2}$.

5.8.10. This exercise continues Problem 5.8.9. Consider the sphere of radius $R$ and center
$\vec{0}$ as an embedded submanifold $f : S^2 \to \mathbb{R}^3$. Prove that

$$\int_{f(S^2)} \star\omega = -4\pi K.$$

[Hint: Use the longitude-latitude coordinate system with $(u,v) \in [0, 2\pi] \times [0, \pi]$.]

5.8.11. Let $T^2$ be the 2-torus embedded in $\mathbb{R}^3$, using the function $f : T^2 \to \mathbb{R}^3$ given in
Example 5.8.4. Show by direct calculation that

$$\int_{f(T^2)} \star\omega = 0,$$

where $\omega$ is the 1-form described in Problem 5.8.9.


5.8.12. Let $M$ be a smooth, oriented manifold. Referring to Theorem 5.8.9, prove that a
smooth (co)vector field $\omega$ on $M$ is conservative if and only if $\omega$ is exact.


5.9 Stokes’ Theorem 


In the last section of this chapter, we present Stokes’ Theorem, a central result in 
the theory of integration on manifolds. 

In multivariable calculus, one encounters a theorem by the same name. What
is called Stokes’ Theorem for vector fields in $\mathbb{R}^3$ states that if $S$ is an oriented,
piecewise-smooth surface that is bounded by a simple, closed, piecewise-smooth
curve $C$, then for any $C^1$ vector field $\vec{F}$ defined over an open region that contains
$S$,

$$\int_C \vec{F} \cdot d\vec{r} = \iint_S (\vec{\nabla} \times \vec{F}) \cdot d\vec{S}.$$

(See [55, Section 17.8].)


It is a striking result that the generalization of this theorem to the context of 
manifolds simultaneously subsumes the Fundamental Theorem of Integral Calculus, 
Green’s Theorem, the classical Stokes’ Theorem, and the Divergence Theorem. 

Before giving the theorem, we state a convention for what it means to integrate
a 0-form on an oriented, zero-dimensional manifold. If $N$ is an oriented
zero-dimensional manifold, then $N = \{p_1, p_2, \ldots, p_c\}$ is a discrete set of points equipped
with an association of signs $s_i = \pm 1$ for each $i = 1, \ldots, c$. Then by convention for
any 0-form $f$ (i.e., a function on $N$),

$$\int_N f = \sum_{i=1}^{c} s_i f(p_i). \tag{5.41}$$

Theorem 5.9.1 (Stokes’ Theorem). Let $M^n$ be a piecewise-smooth, oriented
manifold with or without boundary, and let $\omega$ be an $(n-1)$-form that is compactly
supported on $M$. Equipping $\partial M$ with the induced orientation, the following integrals
are equal:

$$\int_M d\omega = \int_{\partial M} \omega, \tag{5.42}$$

where on the right side we take $\omega$ to mean the restriction of $\omega$ to $\partial M$. If $\partial M = \emptyset$,
we understand the right side as 0.


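Proof. We first treat the case where $n > 1$. Furthermore, we first prove Stokes’
Theorem when $M$ is a smooth manifold.

Suppose first that $\omega$ is compactly supported in a single coordinate chart $(U, \phi)$.
Then by the definition of integration and by Proposition 5.5.6,

$$\int_M d\omega = \int_{\mathbb{R}^n_+} (\phi^{-1})^*(d\omega) = \int_{\mathbb{R}^n_+} d\left( (\phi^{-1})^*\omega \right).$$

Using the $(n-1)$-forms $\eta^j$ defined in Equation (5.24) as a basis for $\Omega^{n-1}(\mathbb{R}^n_+)$, write
$(\phi^{-1})^*\omega = \sum_{j=1}^n \omega_j \eta^j$. Then, for the exterior differential, we have

$$d\left( (\phi^{-1})^*\omega \right) = \sum_{j=1}^n \left( \sum_{i=1}^n \frac{\partial \omega_j}{\partial x^i}\, dx^i \right) \wedge \eta^j = \left( \sum_{j=1}^n \frac{\partial \omega_j}{\partial x^j} \right) dx^1 \wedge \cdots \wedge dx^n,$$

where the second equality follows from Equation (5.25).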


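Since $\omega$ is compactly supported in $U$, then for large enough $R$, the component
functions $\omega_j(x^1, \ldots, x^n)$ vanish identically outside the parallelepiped

$$D_R = \underbrace{[-R, R] \times \cdots \times [-R, R]}_{n-1} \times [0, R].$$

Therefore, we remark that for all $j = 1, \ldots, n-1$, we have

$$\int_{-R}^{R} \frac{\partial \omega_j}{\partial x^j}\, dx^j = \big[ \omega_j(x) \big]_{x^j = -R}^{x^j = R} = 0. \tag{5.43}$$

Consequently, we deduce that

$$\begin{aligned}
\int_M d\omega &= \int_{D_R} \left( \sum_{j=1}^n \frac{\partial \omega_j}{\partial x^j} \right) dx^1 \wedge \cdots \wedge dx^n \\
&= \sum_{j=1}^{n} \int_0^R \int_{-R}^{R} \cdots \int_{-R}^{R} \frac{\partial \omega_j}{\partial x^j}\, dx^1 \cdots dx^n \\
&= \sum_{j=1}^{n-1} \int_0^R \int_{-R}^{R} \cdots \int_{-R}^{R} \left( \int_{-R}^{R} \frac{\partial \omega_j}{\partial x^j}\, dx^j \right) dx^1 \cdots \widehat{dx^j} \cdots dx^n \\
&\qquad + \int_{-R}^{R} \cdots \int_{-R}^{R} \left( \int_0^R \frac{\partial \omega_n}{\partial x^n}\, dx^n \right) dx^1 \cdots dx^{n-1} \\
&= \int_{-R}^{R} \cdots \int_{-R}^{R} \big[ \omega_n(x) \big]_{x^n = 0}^{x^n = R}\, dx^1 \cdots dx^{n-1} \qquad \text{by Equation (5.43)} \\
&= -\int_{-R}^{R} \cdots \int_{-R}^{R} \omega_n(x^1, \ldots, x^{n-1}, 0)\, dx^1 \cdots dx^{n-1},
\end{aligned} \tag{5.44}$$

where the last equality holds because $\omega_n(x^1, \ldots, x^{n-1}, R) = 0$. Note that if the
support of $\omega$ does not meet the boundary $\partial M$, then $\omega_n(x^1, \ldots, x^{n-1}, 0)$ is also
identically 0 and $\int_M d\omega = 0$.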

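To understand the right-hand side of Equation (5.42), let $i : \partial M \to M$ be the
embedding of the boundary into $M$. The restriction of $\omega$ to $\partial M$ is $i^*(\omega)$.
Furthermore, in coordinates in $(U, \phi)$, $i^*(dx^k) = dx^k$ if $k = 1, \ldots, n-1$ and $i^*(dx^n) = 0$.
Hence, $i^*(\eta^j) = 0$ for all $j \ne n$. Thus, in coordinates,

$$\begin{aligned}
i^*(\omega) &= \omega_n(x^1, \ldots, x^{n-1}, 0)\, \eta^n \\
&= (-1)^{n-1} \omega_n(x^1, \ldots, x^{n-1}, 0)\, dx^1 \wedge \cdots \wedge dx^{n-1}.
\end{aligned}$$

However, by Definition 3.7.8 for the orientation induced on the boundary of a
manifold, we have

$$\int_{\partial M} a(x)\, dx^1 \wedge \cdots \wedge dx^{n-1} = (-1)^n \int_{\mathbb{R}^{n-1}} a(x)\, dx^1 \cdots dx^{n-1}$$

for the $(n-1)$-form $a\, dx^1 \wedge \cdots \wedge dx^{n-1}$. Thus

$$\begin{aligned}
\int_{\partial M} \omega &= \int_{D_R \cap \{x^n = 0\}} (-1)^{n-1} \omega_n(x^1, \ldots, x^{n-1}, 0)\, dx^1 \wedge \cdots \wedge dx^{n-1} \\
&= -\int_{-R}^{R} \cdots \int_{-R}^{R} \omega_n(x^1, \ldots, x^{n-1}, 0)\, dx^1 \cdots dx^{n-1},
\end{aligned}$$

which, by (5.44), is equal to $\int_M d\omega$. This proves (5.42) for the case when $\omega$ is
supported in a compact subset of a single coordinate patch.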

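Suppose now that $\omega$ is supported over a compact subset $K$ of $M$ that is not
necessarily a subset of any particular coordinate patch in the atlas $\mathcal{A} = \{(U_\alpha, \phi_\alpha)\}$
for $M$. Then we use a partition of unity $\{\psi_\alpha\}$ that is subordinate to $\mathcal{A}$. Since $K$ is
compact, we can cover it with a finite collection of coordinate patches $\{(U_i, \phi_i)\}_{i=1}^k$.
Then

$$\int_{\partial M} \omega = \sum_{i=1}^k \int_{\partial M} \psi_i \omega = \sum_{i=1}^k \int_M d(\psi_i \omega)$$

by application of Stokes’ Theorem for each form $\psi_i \omega$ that is supported over a
compact set in the coordinate patch $U_i$. But $d(\psi_i \omega) = d\psi_i \wedge \omega + \psi_i\, d\omega$, so

$$\begin{aligned}
\int_{\partial M} \omega &= \sum_{i=1}^k \int_M \left( d\psi_i \wedge \omega + \psi_i\, d\omega \right) = \sum_{i=1}^k \int_M d\psi_i \wedge \omega + \sum_{i=1}^k \int_M \psi_i\, d\omega \\
&= \int_M d\left( \sum_{i=1}^k \psi_i \right) \wedge \omega + \int_M \left( \sum_{i=1}^k \psi_i \right) d\omega \\
&= \int_M d(1) \wedge \omega + \int_M d\omega = 0 + \int_M d\omega = \int_M d\omega.
\end{aligned}$$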


This establishes Stokes’ Theorem for n > 1 and M a smooth manifold. 

Let $n > 1$ and suppose now that $M$ is a piecewise-smooth, oriented manifold,
consisting of smooth submanifolds $M_1, M_2, \ldots, M_\ell$. By definition of integration on
piecewise-smooth manifolds, if $\omega$ is a compactly supported $(n-1)$-form, then

$$\int_M d\omega = \int_{M_1} d\omega + \cdots + \int_{M_\ell} d\omega = \int_{\partial M_1} \omega + \cdots + \int_{\partial M_\ell} \omega,$$

where the second equality follows by Stokes’ Theorem on smooth, oriented
manifolds. By the definition of an orientation on an oriented, piecewise-smooth
manifold, if $M_i$ and $M_j$ share a boundary component, then these components have
opposite induced orientations. Consequently, the boundary components in the set
$\{\partial M_1, \ldots, \partial M_\ell\}$ which do not cancel out precisely form the components of the
boundary $\partial M$. Hence, we again recover

$$\int_M d\omega = \int_{\partial M} \omega.$$

Now consider the case of a 1-manifold $M$. The boundary $\partial M$ is a 0-dimensional
manifold. The 0-form $\omega$ is simply a real-valued function $f$ on $M$. For a compact set
$K$ contained in a coordinate system $\phi : U \to \mathbb{R}_+$ on $M$, the intersection $K \cap \partial M$ is
either empty or consists of a single point $\{p_i\}$. Thus, with the assumption that $f$
is supported over a compact set contained in a coordinate patch of $M$, we conclude
that

$$\int_M df = f(p_i) s_i$$

by the usual Fundamental Theorem of Integral Calculus. By the convention in
(5.41) for integration on a zero-dimensional manifold, we also have $\int_{\partial M} f = f(p_i) s_i$.
Utilizing a partition of unity when $f$ is not assumed to be supported in a single
coordinate patch, one also immediately recovers Stokes’ Theorem.


Two cases of Stokes’ Theorem occur frequently enough to warrant special em- 
phasis. The proofs are implicit in the above proof of Stokes’ Theorem. 


Corollary 5.9.2. If $M$ is a smooth manifold without boundary and $\omega$ is a smooth
$(n-1)$-form, then

$$\int_M d\omega = 0.$$

Corollary 5.9.3. If $M$ is a smooth manifold with or without boundary and $\omega$ is a
smooth $(n-1)$-form that is closed (i.e., $d\omega = 0$), then

$$\int_{\partial M} \omega = 0.$$


The convention for integrating 0-forms on a zero-dimensional manifold allows 
Stokes’ Theorem to directly generalize the Fundamental Theorem of Calculus in 
the following way. Consider the interval [a,b] as a one-dimensional manifold M 
with boundary with orientation of displacement from $a$ to $b$. Then $\partial M = \{a, b\}$
with an orientation of $-1$ for $a$ and $+1$ for $b$. A 0-form on $M$ is a smooth function
$f : [a,b] \to \mathbb{R}$. Then Theorem 5.9.1 simply states that

$$\int_M df = \int_a^b f'(x)\, dx = f(b) - f(a),$$


which is precisely the Fundamental Theorem of Calculus. 
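
One dimension up, the same statement applied to a 1-form on a region of the plane is Green's Theorem. As a numerical illustration, the sketch below takes the closed unit disk with the hypothetical 1-form $\omega = -y^3\, dx + x^3\, dy$, for which $d\omega = 3(x^2 + y^2)\, dx \wedge dy$, and compares the integral of $d\omega$ over the disk with the integral of $\omega$ over the boundary circle, traversed counterclockwise. The form, the region, and the helper names are assumptions chosen for illustration.

```python
import math

def boundary_integral(n=2000):
    """Integral of omega = -y^3 dx + x^3 dy over the unit circle,
    parametrized counterclockwise by (cos t, sin t)."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t), math.cos(t)
        total += (-y**3 * dx + x**3 * dy) * dt
    return total

def d_omega_coefficient(x, y):
    # d(omega) = (d/dx (x^3) - d/dy (-y^3)) dx ^ dy = 3 (x^2 + y^2) dx ^ dy
    return 3.0 * (x**2 + y**2)

def interior_integral(n_r=400, n_theta=400):
    """Integral of d(omega) over the unit disk, midpoint rule in polar coordinates."""
    dr = 1.0 / n_r
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += d_omega_coefficient(x, y) * r * dr * dtheta
    return total

interior = interior_integral()
boundary = boundary_integral()
print(interior, boundary, 1.5 * math.pi)
```

Both integrals equal $3\pi/2$: the interior integral is $\int_0^{2\pi}\!\int_0^1 3r^3\, dr\, d\theta$ and the boundary integral is $\int_0^{2\pi} (\sin^4 t + \cos^4 t)\, dt$.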

The reader might remark that, as stated, Stokes’ Theorem on manifolds only 
generalizes the Fundamental Theorem of Calculus (FTC) when $f$ is a smooth
function, whereas most calculus texts only presuppose that $f$ is $C^1$ over $[a,b]$. The
history behind the FTC is long and we encourage the reader to consult [13] for 
an excellent historical account of the work on defining integrals properly. Since we 
restricted our attention to smooth manifolds and smooth functions, these technical 
details are moot. 


PROBLEMS 


5.9.1. Explicitly show how Stokes’ Theorem on manifolds generalizes Stokes’ Theorem
and the Divergence Theorem from standard multivariable calculus.

5.9.2. Use Stokes’ Theorem to evaluate $\int_S d\omega$, where $S$ is the image in $\mathbb{R}^4$ of the
parametrization

$$r(u,v) = (1 - v)(\cos u, \sin u, \sin 2u, 0) + v(2, \cos u, \sin u, \sin 2u)$$

and where $\omega = x^2\, dx^1 + x^3\, dx^2 + x^4\, dx^3 - x^1\, dx^4$.

5.9.3. Let $M$ be the hypercube in $\mathbb{R}^4$ consisting of the 16 vertices $(\pm 1, \pm 1, \pm 1, \pm 1)$. This
is a manifold with boundary embedded in $\mathbb{R}^4$. Let
$\omega = x\, dy \wedge dz \wedge dw + (3\sin(y + z) + e^{\ldots})\, dx \wedge dz \wedge dw$
be a 3-form in $\mathbb{R}^4$, which we consider as a 3-form on the
boundary of the hypercube $\partial M$. Use Stokes’ Theorem to calculate $\int_{\partial M} \omega$.

5.9.4. Let $B^4 = \{x \in \mathbb{R}^4 \mid \|x\| \le 1\}$ be the unit ball in $\mathbb{R}^4$, and note that $\partial B^4 = S^3$. We
use the coordinates $(x, y, z, w)$ in $\mathbb{R}^4$ and hence in $B^4$. Use Stokes’ Theorem to
evaluate

$$\int_{S^3} \left( e^{\ldots} \cos w\, dx \wedge dy \wedge dz + x^2 z\, dx \wedge dy \wedge dw \right).$$

[Hint: After applying Stokes’ Theorem, consider symmetry across the $w = 0$ plane,
then use a combination of Cartesian and spherical coordinates integration.]

5.9.5. Let $M$ be a compact, oriented $n$-manifold, and let $\omega \in \Omega^j(M)$ and $\eta \in \Omega^k(M)$,
where $j + k = n - 1$. Suppose that $\eta$ vanishes on the boundary $\partial M$ or that $\partial M = \emptyset$.
Show that

$$\int_M \omega \wedge d\eta = (-1)^{j-1} \int_M d\omega \wedge \eta.$$

5.9.6. Let $M$ be a compact, oriented $n$-manifold. Let $\omega$ and $\eta$ be forms of type $j$ and $k$
respectively, such that $j + k = n - 2$. Show that

$$\int_M d\omega \wedge d\eta = \int_{\partial M} \omega \wedge d\eta.$$

Explain how this generalizes the well-known result in multivariable calculus that

$$\int_C (f \vec{\nabla} g) \cdot d\vec{r} = \iint_S (\vec{\nabla} f \times \vec{\nabla} g) \cdot d\vec{S},$$

where $S$ is a regular surface in $\mathbb{R}^3$ with boundary $C$ and where $f$ and $g$ are
real-valued functions that are defined and have continuous second derivatives over an
open set containing $S$.

5.9.7. Integration by Parts on a Curve. Let $M$ be a compact and connected one-dimensional
smooth manifold. Let $f, g : M \to \mathbb{R}$ be two smooth functions on the curve $M$. Show
that $\partial M$ consists of two discrete points $\{p, q\}$. Suppose that $M$ is oriented so that
the orientation induced on $\partial M$ is $-1$ for $p$ and $+1$ for $q$. Show that

$$\int_M f\, dg = f(q)g(q) - f(p)g(p) - \int_M g\, df.$$

5.9.8. Let $M$ be an embedded submanifold of $\mathbb{R}^n$ of dimension $n - 1$. Suppose that $M$
encloses a compact region $R$. Setting $\omega = \frac{1}{n}\left( \sum_{i=1}^n x^i \eta^i \right)$, where the $\eta^i$ are defined
by Equation (5.24), show that the $n$-volume of $R$ is $\int_M \omega$.

5.9.9. Consider $M = \mathbb{R}^n - \{(0, \ldots, 0)\}$ as a submanifold of $\mathbb{R}^n$, and let $S^{n-1}$ be the unit
sphere in $\mathbb{R}^n$ centered at the origin.

(a) Show that if $\omega \in \Omega^{n-1}(M)$ is exact, then $\int_{S^{n-1}} \omega = 0$.

(b) Find an example of a closed form $\omega \in \Omega^{n-1}(M)$ such that $\int_{S^{n-1}} \omega \ne 0$.


CHAPTER 6 


Introduction to Riemannian Geometry 


To recapitulate what we have done in the past two chapters, manifolds are
topological spaces that are locally homeomorphic to a Euclidean space $\mathbb{R}^n$ in which one
could do calculus. Chapter 5 introduced the analysis on manifolds by connecting
it to analysis on $\mathbb{R}^n$ via coordinate charts. However, the astute reader might have
noticed that our presentation of analysis on manifolds so far has not recovered one 
of the foundational aspects of Euclidean calculus: the concept of distance. And 
related to the concept of distance are angles, areas, volumes, curvature... 

In the local theory of regular surfaces $S$ in $\mathbb{R}^3$, the first fundamental form (see
Example 5.2.3) allows one to calculate the length of curves on a surface, the angle
between two intersecting curves, and the area of a compact set on $S$ (see [5, Section
6.1] for details). This should not be surprising: we defined the first fundamental
form on $S$ as the restriction of the usual Euclidean dot product in $\mathbb{R}^3$ to the tangent
space $T_p(S)$ for any given point $p \in S$, and the dot product is the basis for measures
of distances and angles in $\mathbb{R}^3$.

In general, manifolds are not given as topological subspaces of $\mathbb{R}^n$ so one does
not immediately have a first fundamental form as we defined in Example 5.2.3. 
Furthermore, from the definition of a differentiable manifold, it is not at all obvious 
that it has a metric (though we will see in Proposition 6.1.8 that every smooth 
manifold has a metric structure). Consequently, one must equip a manifold with 
a metric structure, which we will call a “Riemannian structure.” Applications of 
manifolds to geometry and curved space in physics will require this additional metric 
structure. 

As in many mathematics texts, our treatment of manifolds and Riemannian 
metrics does not emphasize how long it took these ideas to develop nor have the 
previous two chapters followed the historical trajectory of the subject. After the 
discovery of non-Euclidean geometries (see [11] for a good historical discussion), by 
using only an intuitive notion of a manifold, it was Riemann [47, Section II] in 1854 
who first proposed the idea of a metric that varied at each point of a manifold. 
During the following 50 or more years, many mathematicians (Codazzi, Beltrami, 
Ricci-Curbastro, Levi-Civita, Klein to name a few) developed the theories of
curvature and of geodesy for Riemann spaces. However, the concept of a differential
manifold as presented in Chapter 3 did not appear until 1913 in the work of H. Weyl
[59, I.§4]. According to Steenrod [53, p. v], general definitions for fiber bundles and
vector bundles, which we introduced in part in Chapter 5, did not appear until the
work of Whitney in 1935-1940.

Turning to physics, general relativity, one of the landmark achievements in sci- 
ence of the early 20th century, stands as the most visible application of Riemann 
manifolds to science. Starting from the principle that the speed of light in a vac- 
uum is constant regardless of reference frame [20, p. 42], Einstein developed the 
theory of special relativity, defined in the absence of gravity. The “interpretation” 
of the law that “the gravitational mass of a body is equal to its inertial mass” [20, 
p. 65] and the intention to preserve the principle of the constancy of the speed of 
light led Einstein to understand spacetime as a curved space where “the geomet- 
rical properties of space are not independent, but [...] determined by matter” [20, 
p. 113]. Riemannian metrics, curvature, and the associated theorems for geodesics 
gave Einstein precisely the mathematical tools he needed to express his conception 
of a curvilinear spacetime. 

The reader should be aware that other applications of manifolds to science do 
not (and should not) always require a metric structure. Applications of manifolds 
to either geometry or physics may require a different structure from or additional 
structure to a Riemann metric. For example, in its properly generalized context, 
Hamiltonian mechanics requires the structure of what is called a symplectic manifold.


6.1 Riemannian Metrics 


6.1.1 Definitions and Examples 


Definition 6.1.1. Let $M$ be a smooth manifold. A Riemannian metric on $M$ is a
tensor field $g$ in $\operatorname{Sym}^2 TM^*$ that is positive definite. In more detail, at each point
$p \in M$, a Riemannian metric determines an inner product on $T_pM$. A smooth
manifold $M$ together with a Riemannian metric $g$ is called a Riemannian manifold
and is denoted by the pair $(M, g)$.

Over a coordinate patch of $M$ with coordinate system $(x^1, \ldots, x^n)$, as a section
of $TM^{*\otimes 2}$, one writes the metric $g$ as

$$g = g_{ij}\, dx^i \otimes dx^j,$$

where $g_{ij}$ are smooth functions on $M$. (In this chapter, we regularly use Einstein’s
summation convention.) Since at each point, $g$ is a symmetric tensor, $g_{ij} = g_{ji}$
identically. Furthermore, using the notation from Section 4.6, since $g$ is a section
of $\operatorname{Sym}^2 TM^*$, we write

$$g = g_{ij}\, dx^i\, dx^j. \tag{6.1}$$

The square root of the expression in Equation (6.1) is called the line element $ds$
associated to this metric. Many texts, in particular, physics texts, give the metric
in reference to the line element by writing $ds^2 = g_{ij}\, dx^i\, dx^j$.

For vectors $X, Y \in T_pM$, we sometimes use the same notation as the first
fundamental form ([6, Section 6.1]) and write $\langle X, Y \rangle_p$ for $g_p(X, Y)$, and it is also
common to drop the subscript p whenever the point p is implied by context. By 
analogy with the dot product, the Riemannian metric allows one to define many 
common notions in geometry. 


Definition 6.1.2. Let $(M, g)$ be a Riemannian manifold. Suppose that $X, Y$ are
vectors in $T_pM$.

1. The length of $X$, denoted $\|X\|$, is defined by $\|X\| = \sqrt{g(X, X)}$.
2. The angle $\theta$ between $X$ and $Y$ is defined by

$$\cos\theta = \frac{g(X, Y)}{\|X\|\, \|Y\|}.$$

3. $X$ and $Y$ are called orthogonal if $g(X, Y) = 0$.
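
In coordinates these definitions reduce to matrix arithmetic with the coefficients $g_{ij}$ at the point in question. The sketch below is a minimal illustration: the metric is passed as a nested list of coefficients at a single point, and the sample values (a diagonal metric with $g_{11} = 1$, $g_{22} = 0.25$) are hypothetical. The helper names `inner`, `length`, and `angle` are assumptions, not notation from the text.

```python
import math

def inner(g, X, Y):
    """g(X, Y) = g_ij X^i Y^j for a metric given by its coefficients at one point."""
    n = len(X)
    return sum(g[i][j] * X[i] * Y[j] for i in range(n) for j in range(n))

def length(g, X):
    return math.sqrt(inner(g, X, X))

def angle(g, X, Y):
    """Angle between two nonzero tangent vectors with respect to g."""
    c = inner(g, X, Y) / (length(g, X) * length(g, Y))
    return math.acos(max(-1.0, min(1.0, c)))   # clamp against rounding error

g_at_p = [[1.0, 0.0],
          [0.0, 0.25]]        # hypothetical metric coefficients at a point p

X = [1.0, 0.0]
Y = [0.0, 2.0]
Z = [1.0, 2.0]

print(length(g_at_p, Y))      # 1.0, although the Euclidean length of (0, 2) is 2
print(inner(g_at_p, X, Y))    # 0.0, so X and Y are orthogonal
print(angle(g_at_p, X, Z))    # pi/4 with respect to g
```

With respect to this metric the angle between $X$ and $Z$ is $\pi/4$, whereas the Euclidean angle between the same coordinate vectors is $\arctan 2$; lengths and angles depend on the metric, not on the coordinates alone.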


Whenever one introduces a new mathematical structure, one must discuss func- 
tions between any two instances of them and when two structures are considered 
equivalent. In the context of Riemannian manifolds, one still studies any smooth 
functions between two manifolds. However, two Riemannian manifolds are consid- 
ered the same if they have the same metric. The following definition makes this 
precise. 


Definition 6.1.3. Let $M$ and $N$ be two Riemannian manifolds. A diffeomorphism
$f : M \to N$ is called an isometry if for all $p \in M$,

$$\langle X, Y \rangle_p = \langle df_p(X), df_p(Y) \rangle_{f(p)} \quad \text{for all } X, Y \in T_pM.$$


Two Riemannian manifolds are called isometric if there exists an isometry between 
them. 


From an intuitive perspective, an isometry is a transformation that bends (which 
also includes rigid motions) one manifold into another without stretching or cut- 
ting. Problem 6.1.6 asks the reader to show that the catenoid and the helicoid are 
isometric. Figure 6.1 shows intermediate stages of bending the catenoid into the 
helicoid. Though one might think this transformation incorporates some stretching 
because the longitudinal lines straighten out, the twist created in the helicoid strip 
“balances out” the flattening of the lines in just the right way so that one only 
needs to bend the surface. 

Many examples of Riemannian metrics arise naturally as submanifolds of Rie- 
mannian manifolds. 


Figure 6.1: Bending the catenoid into the helicoid.

Definition 6.1.4. Let $(N, \tilde{g})$ be a Riemannian manifold and $M$ any smooth
manifold. Let $f : M \to N$ be an immersion of $M$ into $N$, i.e., $f$ is differentiable and $df_p$
is injective for all $p$. The metric $g$ on $M$ induced by $f$ (or “from $N$”) is defined as
the pull-back $g = f^*\tilde{g}$. In other words,

$$\langle X, Y \rangle_p = \langle df_p(X), df_p(Y) \rangle_{f(p)} \quad \text{for all } p \in M \text{ and } X, Y \in T_pM.$$

The property that $df_p$ is injective ensures that $\langle\, ,\, \rangle_p$ remains positive definite
when induced on $M$.


Example 6.1.5 (Euclidean Spaces). Consider the manifold $M = \mathbb{R}^n$, where
$T_p(\mathbb{R}^n) = \mathbb{R}^n$ is naturally equipped with a Riemannian metric: the usual dot
product. In particular,

$$g_{ij} = g(\partial_i, \partial_j) = \delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \ne j. \end{cases}$$

This metric is called the Euclidean metric.


Example 6.1.6 (First Fundamental Form). A regular surface S is a 2-manifold 
embedded in $\mathbb{R}^3$, where the embedding map is simply the injection $i : S \to \mathbb{R}^3$. The
first fundamental form (see Example 5.2.3) is precisely the metric on $S$ induced by
$i$ from the Euclidean metric on $\mathbb{R}^3$. This connection gives us immediately a whole
host of examples of Riemannian 2-manifolds that we take from the local theory of 
regular surfaces. 


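Proposition 6.1.7. Let $M$ be an $m$-dimensional manifold embedded in $\mathbb{R}^n$. If
$\vec{F}(u_1, \ldots, u_m)$ is a parametrization of a coordinate patch of $M$, then over this
coordinate patch, the coefficients of the metric $g$ on $M$ induced from $\mathbb{R}^n$ are

$$g_{ij} = \frac{\partial \vec{F}}{\partial u_i} \cdot \frac{\partial \vec{F}}{\partial u_j}.$$

Proof. Let $(x^1, \ldots, x^n)$ be the coordinates on $\mathbb{R}^n$. Suppose that a coordinate patch
$(U, \phi)$ of $M$ has coordinates $(u_1, \ldots, u_m)$ and that a parametrization of this
coordinate patch is $\vec{F}(u_1, \ldots, u_m) = \phi^{-1}(u_1, \ldots, u_m)$. By Equation (3.14) the matrix of
$d\vec{F}$ in the given coordinate systems is

$$\left( \frac{\partial F^i}{\partial u_j} \right),$$

where $\vec{F} = (F^1, \ldots, F^n)$.

Set $\langle\, ,\, \rangle$ as the usual dot product in $\mathbb{R}^n$. Then at each point in $U$, the coefficients
$g_{kl}$ of the metric $g$ satisfy

$$\begin{aligned}
g_{kl} = g\left( \frac{\partial}{\partial u_k}, \frac{\partial}{\partial u_l} \right) &= \left\langle d\vec{F}\left( \frac{\partial}{\partial u_k} \right), d\vec{F}\left( \frac{\partial}{\partial u_l} \right) \right\rangle = \frac{\partial F^i}{\partial u_k} \frac{\partial F^j}{\partial u_l} \left\langle \frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j} \right\rangle \\
&= \frac{\partial F^i}{\partial u_k} \frac{\partial F^j}{\partial u_l}\, \delta_{ij} = \frac{\partial \vec{F}}{\partial u_k} \cdot \frac{\partial \vec{F}}{\partial u_l}.
\end{aligned}$$
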
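
The formula of the proposition can be evaluated numerically for any parametrization. The sketch below estimates the partial derivatives $\partial \vec{F} / \partial u_i$ by central differences and assembles the matrix of dot products. The helper `induced_metric` and the colatitude-longitude parametrization of the unit sphere are illustrative assumptions; for that parametrization the exact coefficients are $g_{11} = 1$, $g_{12} = 0$, $g_{22} = \sin^2 u_1$.

```python
import math

def induced_metric(F, u, h=1e-6):
    """Coefficients g_ij = (dF/du_i) . (dF/du_j) at the parameter point u,
    with the partial derivatives estimated by central differences."""
    m = len(u)
    partials = []
    for i in range(m):
        u_plus, u_minus = list(u), list(u)
        u_plus[i] += h
        u_minus[i] -= h
        partials.append([(a - b) / (2.0 * h) for a, b in zip(F(u_plus), F(u_minus))])
    return [[sum(a * b for a, b in zip(partials[i], partials[j]))
             for j in range(m)] for i in range(m)]

def sphere(u):
    # u[0] = colatitude, u[1] = longitude on the unit sphere in R^3
    return (math.sin(u[0]) * math.cos(u[1]),
            math.sin(u[0]) * math.sin(u[1]),
            math.cos(u[0]))

g = induced_metric(sphere, [math.pi / 3.0, 1.0])
print(g)
```

At colatitude $\pi/3$ the printed matrix should be close to the diagonal matrix with entries $1$ and $\sin^2(\pi/3) = 0.75$.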
We should emphasize at this point that a given manifold can be equipped with 
nonisometric Riemannian metrics. Problem 6.1.2 presents two different metrics on 
the 3-torus, each depending on a different embedding into some Euclidean space. 
In both cases, the 3-torus can be equipped with the same atlas, and so in both 
situations, the 3-torus is the same as a smooth manifold. 

As another example, already in his seminal dissertation [47], Riemann introduced
the following metric on the open unit ball in $\mathbb{R}^n$:

$$g_{ii} = \frac{4}{(1 - \|x\|^2)^2} \quad \text{and} \quad g_{ij} = 0 \text{ if } i \ne j. \tag{6.2}$$


As we will see, this is not isometric with the open unit ball equipped with the 
Euclidean metric. 

Example 6.1.5 could be misleading in its simplicity. The reader might consider
the possibility of defining a metric on any smooth manifold M by taking $\langle\,,\rangle_p$ as
the usual dot product in each $T_pM$ with respect to the coordinate basis associated
to a particular coordinate system. The problem with this idea is that a single
coordinate system generally does not cover M, so this does not
define a smooth section of $\operatorname{Sym}^2 TM^*$ over the whole manifold. Nonetheless, as the
proof of the following proposition shows, we can use a partition of unity and stitch
these bilinear forms together.


Proposition 6.1.8. Every smooth manifold M has a Riemannian metric. 


Proof. Let M be a smooth manifold with atlas $\mathcal{A} = \{(U_\alpha,\phi_\alpha)\}_{\alpha\in I}$. For each $\alpha \in I$,
label $\langle\,,\rangle^\alpha$ as the usual dot product with respect to the coordinate basis over $U_\alpha$.
Let $\{\psi_\alpha\}$ be a partition of unity that is subordinate to $\mathcal{A}$. For each $p \in M$, define
the bilinear form $\langle\,,\rangle_p$ on $T_pM$ by

$$\langle X,Y\rangle_p = \sum_{\alpha\in I}\psi_\alpha(p)\,\langle X,Y\rangle^\alpha_p \tag{6.3}$$

for any $X,Y \in T_pM$.
Since for each $p \in M$ only a finite number of $\alpha \in I$ have $\psi_\alpha(p) \neq 0$,
the sum in Equation (6.3) is finite. It is obvious by construction that $\langle X,Y\rangle_p$ is
symmetric. To prove that $\langle\,,\rangle_p$ is positive definite, note that each $\langle\,,\rangle^\alpha_p$ is. Let
$I'$ be the set of all indices $\alpha \in I$ such that $\psi_\alpha(p) \neq 0$. By definition of a partition
of unity, $0 < \psi_\alpha(p) \leq 1$ for all $\alpha \in I'$. Thus, for all $X \in T_pM$, clearly $\langle X,X\rangle_p \geq 0$. Furthermore,
if $\langle X,X\rangle_p = 0$, then at least one summand in

$$\sum_{\alpha\in I'}\psi_\alpha(p)\,\langle X,X\rangle^\alpha_p$$

is 0. (In fact, all summands are 0.) Thus, there exists an $\alpha \in I'$ with $\langle X,X\rangle^\alpha_p = 0$.
Since $\langle\,,\rangle^\alpha_p$ is positive definite, $X = 0$. Hence, $\langle\,,\rangle_p$ itself is positive
definite.


Though Equation (6.3) presents a Riemannian metric on any smooth manifold 
M, this is not in general easy to work with for specific calculations since it uses a 
partition of unity, which involves functions that are usually complicated. At any
given point $p \in M$, Equation (6.3) does not involve one coordinate system around
$p$ but all of the atlas's coordinate neighborhoods of $p$.

More importantly, the Riemannian metric constructed in the above proof might 
not have any natural meaning. 


Example 6.1.9 (Projective Space). There is a natural metric on projective space
$\mathbb{RP}^n$ that is induced from the Euclidean metric on $\mathbb{R}^{n+1}$. First, let g be the metric
on $S^n$ induced from Euclidean $\mathbb{R}^{n+1}$ as an embedded submanifold. Recall that the
projection map $\pi : S^n \to \mathbb{RP}^n$ as presented for n = 2 in Example 3.2.3 is a smooth
function between manifolds. Define the metric $\tilde g$ on $\mathbb{RP}^n$ by

$$\tilde g_{\pi(p)}(v,w) = g_p\big((d\pi_p)^{-1}(v), (d\pi_p)^{-1}(w)\big), \tag{6.4}$$

for all $v,w \in T_{\pi(p)}\mathbb{RP}^n$. Note first that for all $p \in S^n$, the linear transformation
$d\pi_p$ is surjective between spaces of the same dimension, so is invertible. More
importantly, $\tilde g$ is well-defined: If $A : S^n \to S^n$ is the antipodal map $A(p) = -p$,
then $\pi\circ A = \pi$. Hence, $d\pi_p = d(\pi\circ A)_p = d\pi_{-p}\circ dA_p$. Hence

$$g_p\big((d\pi_p)^{-1}(v), (d\pi_p)^{-1}(w)\big)
= g_p\big((dA_p)^{-1}((d\pi_{-p})^{-1}(v)), (dA_p)^{-1}((d\pi_{-p})^{-1}(w))\big)
= g_{-p}\big((d\pi_{-p})^{-1}(v), (d\pi_{-p})^{-1}(w)\big),$$

where the second equality follows because $A : S^n \to S^n$ is an isometry. Consequently,
in (6.4), the choice of $p$ or $-p$ for the pre-image of $\pi(p)$ in $\mathbb{RP}^n$ is irrelevant.


6.1.2 Arclength and Volume 


Using integration, the Riemannian metric allows for formulas that measure nonlocal
properties, such as length of a curve and volume of a region on a manifold.
For example, consider a $C^1$ curve $\gamma : [a,b] \to M$ on a Riemannian manifold
$(M,g)$. At each point $\gamma(t)$ in M, the vector $\gamma'(t) = \frac{d\gamma}{dt}$ is a tangent vector, called
the velocity vector. The Riemannian metric $g = \langle\,,\rangle$ allows one to calculate the
length $\|\gamma'(t)\|$, which we call the speed. This motivates the following definition.


Definition 6.1.10. Let $\gamma : [a,b] \to M$ be a curve of class $C^1$ on a Riemannian
manifold M. The arclength of the curve $\gamma$ is

$$\int_a^b \sqrt{\left\langle \frac{d\gamma}{dt}, \frac{d\gamma}{dt}\right\rangle_{\gamma(t)}}\;dt.$$
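To make Definition 6.1.10 concrete, here is a minimal Python sketch that integrates the speed of a curve numerically. It assumes the round metric on the unit sphere in longitude-latitude coordinates $(u,v)$, namely the diagonal matrix with entries $\sin^2 v$ and $1$ (this matrix is computed in Example 6.1.14). The function names, the midpoint rule, and the chosen latitude circle are illustrative assumptions.

```python
import math

def metric(point):
    """Round metric on the unit sphere in (u, v) coordinates: diag(sin^2 v, 1)."""
    u, v = point
    return [[math.sin(v) ** 2, 0.0], [0.0, 1.0]]

def arclength(curve, velocity, a, b, steps=10000):
    """Midpoint-rule approximation of the integral of sqrt(g(gamma', gamma'))."""
    dt = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * dt
        g = metric(curve(t))
        w = velocity(t)
        # Squared speed g_ij w^i w^j in the coordinate basis
        speed_sq = sum(g[i][j] * w[i] * w[j] for i in range(2) for j in range(2))
        total += math.sqrt(speed_sq) * dt
    return total

# Latitude circle v = pi/3, traversed once: gamma(t) = (t, pi/3), gamma'(t) = (1, 0)
v0 = math.pi / 3
length = arclength(lambda t: (t, v0), lambda t: (1.0, 0.0), 0.0, 2 * math.pi)
print(length)
```

For this curve the speed is the constant $\sin v_0$, so the computed length should agree with $2\pi \sin v_0$, the circumference of a Euclidean circle of radius $\sin v_0$.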


Proposition 6.1.11. Let (M,g) be an oriented Riemannian manifold of dimension
n. There exists a unique n-form, denoted dV, such that at all $p \in M$, it satisfies
$dV_p(e_1,\ldots,e_n) = 1$ for all positively oriented bases $(e_1,\ldots,e_n)$ in $T_pM$ that are orthonormal with
respect to $g_p(\,,)$. Furthermore, over any coordinate patch U with coordinates
$x = (x^1,\ldots,x^n)$,

$$dV = \sqrt{\det(g_{ij})}\;dx^1\wedge\cdots\wedge dx^n, \tag{6.5}$$

where $g_{ij} = g(\partial_i,\partial_j) = \langle \partial/\partial x^i, \partial/\partial x^j\rangle$.


Proof. The content of this proposition is primarily linear algebra. By Proposition
C.2.1, on each coordinate patch $U_\alpha$, the form $dV|_{U_\alpha} = \omega_\alpha$ exists on each $T_pM$ and
is given by Equation (6.5). The existence of this n-form $\omega_\alpha$ with the desired property
explicitly requires that $g_p$ be an inner product on $T_pM$. In order to define the form
dV on the whole manifold, we refer to a partition of unity $\{\psi_\alpha\}$ subordinate to the
atlas on M and define

$$dV = \sum_\alpha \psi_\alpha\,\omega_\alpha.$$


Definition 6.1.12. The form dV described in Proposition 6.1.11 is called the volume
form of (M,g) and is denoted $dV_M$ if there is a chance of confusion about the
manifold. If $M^m$ is a compact manifold, then the m-volume of M is the integral

$$\operatorname{Vol}(M) = \int_M dV. \tag{6.6}$$


If $i : M^m \to N^n$ is an embedded submanifold of a Riemannian manifold $(N,\tilde g)$,
then we can also calculate the m-volume of the submanifold M by equipping M with
the metric $g = i^*\tilde g$, i.e., the metric induced from N. The reader should be aware
that the volume form on M is not necessarily $i^*(dV_N)$. In particular, if m < n, then
$i^*(dV_N)$ would be an n-form on M, but there are no nonzero n-forms on M. We illustrate
this with a few examples.


Example 6.1.13 (Arclength). Definition 6.1.10 should actually be a corollary of 
Proposition 6.1.11 and Definition 6.1.12. Let $(M,\tilde g)$ be a Riemannian manifold and
let $\gamma : [a,b] \to M$ be a curve of class $C^1$ on M. The induced metric g on $\gamma$ is defined
by

$$g(v,w) = \tilde g_{\gamma(t)}\left(\frac{d\gamma}{dt}\,v, \frac{d\gamma}{dt}\,w\right).$$

However, since a curve is one-dimensional, both tangent vectors v and w are scalars.
So

$$g(v,w) = \tilde g_{\gamma(t)}\left(\frac{d\gamma}{dt}, \frac{d\gamma}{dt}\right)vw.$$

In the coordinate t on [a,b], the domain of $\gamma$, the metric tensor can be represented
by a 1 × 1 matrix $\big(\tilde g_{\gamma(t)}(d\gamma/dt, d\gamma/dt)\big)$. Thus, the volume form on $\gamma$ from the
metric induced from M is

$$dV = \sqrt{\tilde g_{\gamma(t)}\left(\frac{d\gamma}{dt}, \frac{d\gamma}{dt}\right)}\;dt.$$

Definition 6.1.10 follows as a corollary.

Example 6.1.14 (Volume form on $S^n$). Consider the unit n-sphere $S^n$ as a Riemannian
manifold, equipped with the metric induced from its usual embedding in
$\mathbb{R}^{n+1}$.

Consider the usual longitude-latitude parametrization of the sphere $S^2$:

$$X(u,v) = (\cos u\sin v, \sin u\sin v, \cos v) \quad\text{for } (u,v) \in [0,2\pi]\times[0,\pi].$$

Note that if we restrict the domain to $(0,2\pi)\times(0,\pi)$, we obtain a dense open subset
of $S^2$. By Proposition 6.1.7, with respect to this coordinate system, the coefficients
of the metric tensor are

$$(g_{ij}) = \begin{pmatrix} \sin^2 v & 0 \\ 0 & 1 \end{pmatrix}.$$

Since $\sin v \geq 0$ for $v \in [0,\pi]$, the volume form on $S^2$ with respect to this
coordinate system is $dV = \sin v\,du\wedge dv$. By Proposition 5.8.1, the volume of the
sphere is calculated by

$$V = \int_{S^2} dV = \int_0^\pi\int_0^{2\pi} \sin v\,du\,dv = 2\pi\big[-\cos v\big]_0^\pi = 4\pi.$$
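The same volume can be recovered numerically from Equation (6.5). The sketch below integrates $\sqrt{\det(g_{ij})}$ over the coordinate rectangle with a midpoint rule; the function names and the grid size `n` are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def sqrt_det_g(u, v):
    """sqrt(det(g_ij)) for the sphere metric diag(sin^2 v, 1)."""
    return math.sqrt(math.sin(v) ** 2 * 1.0)

def volume(density, u_range, v_range, n=400):
    """Midpoint-rule approximation of the integral of density(u, v) du dv."""
    du = (u_range[1] - u_range[0]) / n
    dv = (v_range[1] - v_range[0]) / n
    total = 0.0
    for i in range(n):
        u = u_range[0] + (i + 0.5) * du
        for j in range(n):
            v = v_range[0] + (j + 0.5) * dv
            total += density(u, v) * du * dv
    return total

area = volume(sqrt_det_g, (0.0, 2 * math.pi), (0.0, math.pi))
print(area, 4 * math.pi)
```

The two printed numbers should agree to several decimal places, with the discrepancy shrinking as `n` grows.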


We now calculate the volume form on $S^n$ using an alternate approach. By
Example 4.6.24, we see that the volume form on $\mathbb{R}^{n+1}$ is

$$e^{*1}\wedge e^{*2}\wedge\cdots\wedge e^{*(n+1)},$$

where $\{e_1,\ldots,e_{n+1}\}$ is the standard basis on $\mathbb{R}^{n+1}$. Furthermore, recall that as an
alternating function,

$$e^{*1}\wedge\cdots\wedge e^{*(n+1)}(v_1,\ldots,v_{n+1}) = \det\begin{pmatrix} v_1 & \cdots & v_{n+1}\end{pmatrix}$$

for any (n + 1)-tuple of vectors $(v_1,\ldots,v_{n+1})$.


Figure 6.2: Volume form on the sphere.


Define a form $\omega \in \Omega^n(\mathbb{R}^{n+1})$, where for each $x \in \mathbb{R}^{n+1}$ and for any vectors $v_i$,

$$\omega_x(v_1,\ldots,v_n) = \det\begin{pmatrix} x & v_1 & \cdots & v_n\end{pmatrix}.$$

This is in fact the same construction as the inner product on forms $i_x\omega$, but where
we are taking advantage of the identification of $\mathbb{R}^{n+1}$ with its own tangent space.
By the properties of the determinant, for each x, $\omega_x$ is an alternating n-multilinear
function on $\mathbb{R}^{n+1}$, so $\omega$ is indeed an n-form. Using the Laplace expansion of the
determinant, it is easy to show that

$$\omega = \sum_{i=1}^{n+1} (-1)^{i-1}\,x^i\;dx^1\wedge\cdots\wedge\widehat{dx^i}\wedge\cdots\wedge dx^{n+1}, \tag{6.7}$$

where the $\widehat{\ }$ notation means to exclude the bracketed term. Using the forms $\eta^j$
introduced in Equation (5.24), we can write $\omega = \sum_j x^j\eta^j$.

Now if $x \in S^n$, then from the geometry of the sphere, x is perpendicular to $T_xS^n$
as a subspace of $\mathbb{R}^{n+1}$ (equipped with the Euclidean metric). Thus, if $\{v_1,\ldots,v_n\}$
forms a basis of $T_xS^n$, then $\{x,v_1,\ldots,v_n\}$ forms a basis of $\mathbb{R}^{n+1}$. See Figure 6.2.
Furthermore, if the n-tuple $(v_1,\ldots,v_n)$ is an orthonormal, positively-oriented basis
of $T_xS^n$, then $(x,v_1,\ldots,v_n)$ is an orthonormal, positively-oriented basis of $\mathbb{R}^{n+1}$.


But then the restriction of $\omega$ to $S^n$ has the properties described in Proposition
6.1.11. Hence, if $f : S^n \to \mathbb{R}^{n+1}$ is the usual embedding of the sphere in Euclidean
space, then we obtain the volume form of $S^n$ as

$$dV_{S^n} = \omega|_{S^n} = f^*(\omega).$$


In Section 5.7, we attempted to generalize with the single technique of inte- 
gration on manifolds all the types of integration introduced in a standard calculus 
sequence. However, there were two types of integrals in the list at the beginning of 
the section that did not fit in the formalism we had developed for the integration 
of n-forms on n-dimensional smooth manifolds, namely:

• Line integrals of functions in $\mathbb{R}^n$.

• Surface integrals of a real-valued function defined over a closed and bounded
region of a regular surface in $\mathbb{R}^3$.

Both of these types of integrals fit into the theory of integration of volume forms on
manifolds in the following ways.

For the line integral of functions in $\mathbb{R}^n$ over a piecewise-smooth curve C, let
$\gamma : [a,b] \to \mathbb{R}^n$ be a parametrization of C. Let $\langle\,,\rangle$ be the Euclidean form on
$\mathbb{R}^n$ (i.e., the dot product). Then each smooth piece of $\gamma$ is a one-dimensional
submanifold of $\mathbb{R}^n$, equipped with the metric induced from $\mathbb{R}^n$. The volume form
on $\gamma$ is $dV_\gamma$, so that for any smooth function f defined on a neighborhood of C,

$$\int_\gamma f\,dV_\gamma = \int_a^b f(\gamma(t))\,\sqrt{\langle\gamma'(t),\gamma'(t)\rangle}\;dt = \int_C f\,ds. \tag{6.8}$$

For surface integrals of a function f on a compact regular surface $S \subset \mathbb{R}^3$, it
is not hard to show (see Problem 6.1.1) that $dS = dV_S$, where $dV_S$ is the volume
form on S equipped with the metric induced from the Euclidean metric. Thus,
connecting the classical notation with the notation introduced in this section,

$$\iint_S f\,dS = \int_S f\,dV_S. \tag{6.9}$$


6.1.3 Raising and Lowering Indices 


One interesting property of metrics on manifolds is that they give us a natural way 
to go back and forth between vectors and covectors. 

Recall from Section 4.1 that for any real vector space V, the dual vector space
$V^*$ consists of all linear transformations $V \to \mathbb{R}$. If V is a vector space equipped
with any bilinear form $\langle\,,\rangle$, then this form defines a linear transformation into the
dual $V^*$ by

$$i : V \to V^*, \qquad v \mapsto i_v = \big(w \mapsto \langle v,w\rangle\big).$$

If V is finite-dimensional and the bilinear form $\langle\,,\rangle$ is nondegenerate, then the
mapping $(v \mapsto i_v)$ is an isomorphism. We will assume from now on that $\langle\,,\rangle$ is
an inner product. Positive-definiteness implies that the bilinear form is nondegenerate,
so the above mapping is an isomorphism. This isomorphism allows us to define a
natural bilinear form $\langle\,,\rangle^*$ on $V^*$ by

$$\langle\eta,\tau\rangle^* = \langle i^{-1}(\eta), i^{-1}(\tau)\rangle.$$

Then $\langle\,,\rangle^*$ is an element of $V\otimes V$, so is a tensor of type (2,0). Furthermore,
if the components of $\langle\,,\rangle$ are $c_{jk}$, then the components of $\langle\,,\rangle^*$ with respect to
the associated basis are denoted by $c^{jk}$, where these are the entries of the inverse
matrix $C^{-1}$, where $C = (c_{jk})$. (See Exercises 4.2.4 and 4.2.5 for where we prove
these results.)

In coordinates, let $\mathcal{B} = \{u_1,\ldots,u_n\}$ be a basis of V and let $\mathcal{B}^* = \{u^{*1},\ldots,u^{*n}\}$
be the associated cobasis. Let $C = (c_{jk})$ be the matrix of $\langle\,,\rangle$ with respect to $\mathcal{B}$,
viz., $c_{jk} = \langle u_j,u_k\rangle$. Then

$$\langle v,w\rangle = [v]_{\mathcal{B}}^{T}\,C\,[w]_{\mathcal{B}} = c_{jk}v^jw^k. \tag{6.10}$$

This gives the coordinates of $i_v$ with respect to $\mathcal{B}^*$ as $c_{jk}v^j$. Note that the indices
for the components of $i_v$ arise naturally as subscripts, consistent with our notation.
Similarly, if $\lambda \in V^*$ is a functional on V, then $\lambda = \lambda_k u^{*k}$. If $\lambda = i_v$ for some $v \in V$,
then

$$c^{kl}\lambda_k = c^{kl}c_{jk}v^j = c_{jk}c^{kl}v^j = \delta^l_j v^j = v^l. \tag{6.11}$$

We say that the process of mapping $v$ to $i_v$ “lowers the indices,” while mapping $\lambda$
to $i^{-1}(\lambda)$ “raises the indices.”
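The index gymnastics above amount to multiplying by a symmetric matrix and by its inverse. The following Python sketch lowers the index of a vector with a sample inner product and then raises it again, recovering the original components as in Equation (6.11). The matrix `C`, the vector `v`, and the helper names are hypothetical values chosen only for illustration.

```python
def lower(c, v):
    """Components of i_v: (i_v)_k = c_jk v^j."""
    n = len(v)
    return [sum(c[j][k] * v[j] for j in range(n)) for k in range(n)]

def inverse_2x2(c):
    """Inverse of a 2 x 2 matrix, giving the components c^{jk}."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return [[c[1][1] / det, -c[0][1] / det],
            [-c[1][0] / det, c[0][0] / det]]

def raise_index(c_inv, lam):
    """Components of i^{-1}(lambda): v^l = c^{kl} lambda_k."""
    n = len(lam)
    return [sum(c_inv[k][l] * lam[k] for k in range(n)) for l in range(n)]

# Sample positive-definite matrix of an inner product and a sample vector
C = [[2.0, 1.0], [1.0, 3.0]]
v = [1.0, -2.0]

lam = lower(C, v)                            # covector components c_jk v^j
v_back = raise_index(inverse_2x2(C), lam)    # should reproduce v
print(lam, v_back)
```

Here lowering gives the covector components $(0, -5)$, and raising them returns $(1, -2)$ up to rounding.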

Now consider a Riemannian manifold (M,g). If $X \in \mathfrak{X}(M)$ is a vector field on
M, we define the covector field $X^\flat$ by

$$X^\flat(Y) \overset{\text{def}}{=} g(X,Y)$$

for all vector fields $Y \in \mathfrak{X}(M)$. On a coordinate patch, X has components $X^i$.
By the process described in the previous paragraphs, $X^\flat$ has components $g_{ij}X^i$.
Mimicking musical notation, the function $\Gamma(TM) \to \Gamma(TM^*)$ that sends $X \mapsto X^\flat$
is called the flat, since it lowers the indices of the vector field X.
As we saw, a metric g on M also induces an inner product $g^*$ on each cotangent space $T_pM^*$, defined
by $g^*(\eta,\tau) = g(i^{-1}(\eta), i^{-1}(\tau))$ for any $\eta,\tau \in T_pM^*$. So if $\omega \in \Omega^1(M)$ is a covector
field on M, we define the vector field $\omega^\sharp$ by

$$\tau(\omega^\sharp) \overset{\text{def}}{=} g^*(\omega,\tau)$$

for all covector fields $\tau \in \Omega^1(M)$. On a given coordinate patch, $\omega$ has component
functions $\omega_i$ and the components of $\omega^\sharp$ are $g^{ij}\omega_i$. Keeping the musical analogy, we
call this vector field the sharp of $\omega$ since the process raises the indices.

More generally, if T is any tensor field of type (p,q) on M, then

$$g_{j\,i_\alpha}\,T^{i_1\cdots i_\alpha\cdots i_p}_{j_1\cdots j_q} \qquad\text{and}\qquad g^{i\,j_\beta}\,T^{i_1\cdots i_p}_{j_1\cdots j_\beta\cdots j_q}$$

define tensor fields of type (p − 1, q + 1) and type (p + 1, q − 1), respectively. It
is common to still use the $\flat$ and $\sharp$ notation, here $T^\flat$ and $T^\sharp$, but one must indicate
upon which index one performs the lowering or raising operations.

Recall that the trace of a matrix A is defined as the sum of the diagonal elements.
If A has components $A^i_j$, then the trace is just $A^i_i$, using the Einstein summation
convention of summing along i. Now A corresponds to a linear transformation
$T(v) = Av$ on a vector space V. Since the trace Tr A is also the sum of the
eigenvalues, the trace remains unchanged under a change in basis in V.

Now, if A is a symmetric (0,2)-tensor, then $A^\sharp$ is a (1,1)-tensor, and the trace
$\operatorname{Tr} A^\sharp$ is defined in its usual linear algebraic sense. This process is common enough
that we define the trace with respect to g of A to be

$$\operatorname{Tr}_g A \overset{\text{def}}{=} \operatorname{Tr} A^\sharp. \tag{6.12}$$

In coordinates, $\operatorname{Tr}_g A = g^{ij}A_{ij}$.
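As a quick worked illustration of the coordinate formula $\operatorname{Tr}_g A = g^{ij}A_{ij}$, consider two simple cases. The first takes A to be the metric itself on an n-dimensional manifold; the second assumes the sphere metric with $g_{11} = \sin^2 v$, $g_{22} = 1$, $g_{12} = 0$ in $(u,v)$ coordinates (with $0 < v < \pi$) and the sample tensor $A = du\otimes du$, which is chosen here only for illustration.

```latex
% Case 1: A = g on an n-dimensional manifold
\operatorname{Tr}_g g = g^{ij} g_{ij} = \delta^i_i = n .

% Case 2: S^2 with (g_{ij}) = diag(\sin^2 v, 1) and A = du \otimes du,
% so A_{11} = 1 and all other A_{ij} = 0
\operatorname{Tr}_g A = g^{11} A_{11} = \frac{1}{\sin^2 v} .
```

The second case shows that the trace with respect to g depends on the metric and not only on the components $A_{ij}$.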


PROBLEMS 


6.1.1. Recall the following formula from calculus. Let K be a compact set on a regular
surface S in $\mathbb{R}^3$. Suppose that S is parametrized by $F(u,v)$ and that under this
parametrization, $K = F(D)$ for some compact region D. Then the surface area
of K on S is

$$\iint_K dS = \iint_D \|F_u\times F_v\|\;du\,dv.$$

Consider the regular surface S as a 2-manifold embedded in the Riemannian manifold
$\mathbb{R}^3$, equipped with the dot product as its usual metric. The parametrization
$F(u,v)$ describes the embedding of S in $\mathbb{R}^3$ in reference to the coordinates in a
chart of S. Prove that the volume form on S for the metric induced from the dot
product in $\mathbb{R}^3$ is

$$dV_S = \|F_u\times F_v\|\;du\wedge dv.$$

This establishes the familiar surface area integral from Definition 6.1.12.


6.1.2. Consider the 3-torus $T^3 = S^1\times S^1\times S^1$. Calculate the induced metrics for the
following two embeddings:

(a) Into $\mathbb{R}^6$ as the image of the parametrization

$$F(u^1,u^2,u^3) = (\cos u^1, \sin u^1, \cos u^2, \sin u^2, \cos u^3, \sin u^3).$$

(b) Into $\mathbb{R}^4$ as the image of the parametrization $F(u^1,u^2,u^3)$ given by

$$\big((c + (b + a\cos u^1)\cos u^2)\cos u^3,\ (c + (b + a\cos u^1)\cos u^2)\sin u^3,\ (b + a\cos u^1)\sin u^2,\ a\sin u^1\big),$$

where b > a > 0 and c > a + b.

(c) Prove that these two Riemannian manifolds are not isometric.

[Hint: This gives two different metrics on the 3-torus that can be equipped with
the same atlas in each case.]


6.1.3. We consider an embedding of $S^1\times S^2$ in $\mathbb{R}^4$ by analogy with the embedding of the
torus $S^1\times S^1$ in $\mathbb{R}^3$. Place the sphere $S^2$ with radius a at (0,0,b,0) (where b > a)
as a subset of the $x^1x^2x^3$-subspace, and rotate this sphere about the origin with
a motion parallel to the $x^3x^4$-plane. Call this submanifold M and equip it with the
metric induced from $\mathbb{R}^4$.

(a) Show that the described manifold M is an embedding as claimed.

(b) Find a parametrization $F : D \to \mathbb{R}^4$, where $D \subset \mathbb{R}^3$, such that, as sets of
points, $F(D) = M$, and $F(D^\circ)$ is an open subset of M that is homeomorphic
to $D^\circ$. ($D^\circ$ is the interior of the set D.)

(c) Calculate the coefficients $g_{ij}$ of the metric on M in the coordinate patch
defined by the above parametrization.


6.1.4. Let $(M_1,g_1)$ and $(M_2,g_2)$ be two Riemannian manifolds, and consider the product
manifold $M_1\times M_2$ with a (0,2)-tensor field defined by

$$g(X_1 + X_2, Y_1 + Y_2) = g_1(X_1,Y_1) + g_2(X_2,Y_2).$$

(a) Show that g defines a metric on $M_1\times M_2$.

(b) Let $(x^1,\ldots,x^n)$ be local coordinates on $M_1$ and $(x^{n+1},\ldots,x^{n+m})$ be local
coordinates on $M_2$ so that $(x^1,\ldots,x^{n+m})$ are local coordinates on $M_1\times M_2$.
Determine the components of the metric g on $M_1\times M_2$ in terms of $g_1$ and
$g_2$.


6.1.5. Repeat Problem 6.1.3 with $S^1\times S$, where S is a regular surface in $\mathbb{R}^3$ that does
not intersect the plane $z = x^3 = 0$.


6.1.6. Consider the following two regular surfaces in $\mathbb{R}^3$. The catenoid parametrized by

$$F(\bar u^1,\bar u^2) = (\bar u^2\cos\bar u^1,\ \bar u^2\sin\bar u^1,\ \cosh^{-1}\bar u^2) \quad\text{for } (\bar u^1,\bar u^2) \in [0,2\pi)\times[1,+\infty)$$

and the helicoid parametrized by

$$F(u^1,u^2) = (u^2\cos u^1,\ u^2\sin u^1,\ u^1) \quad\text{for } (u^1,u^2) \in (0,2\pi)\times\mathbb{R}.$$

Prove that the helicoid and catenoid are isometric, and find an isometry between
them.


6.1.7. Let M be a hypersurface of $\mathbb{R}^n$ (submanifold of dimension n − 1), and equip M with
a Riemannian structure with the metric induced from $\mathbb{R}^n$. Suppose that an open
set U of M is a graph of an (n − 1)-variable function f, i.e., the parametrization
of U is

$$x^1 = u^1,\ \ldots,\ x^{n-1} = u^{n-1},\ x^n = f(u^1,\ldots,u^{n-1})$$

for $(u^1,\ldots,u^{n-1}) \in D$.

(a) Find the coefficients of the metric tensor g on M, and conclude that a formula
for the (n − 1)-volume of U is

$$\int_D \sqrt{1 + \|\operatorname{grad} f\|^2}\;dV.$$

(b) Use this result to calculate the 3-volume of the surface in $\mathbb{R}^4$ given by
$w = x^2 + y^2 + z^2$ for $x^2 + y^2 + z^2 \leq 4$.


6.1.8. Let M, N, and S be Riemannian manifolds, and let $f : M \to N$ and $h : N \to S$
be isometries.

(a) Show that $f^{-1}$ is an isometry.

(b) Show that $h\circ f$ is an isometry.

(c) If you have seen some group theory, show that the set of isometries on a
Riemannian manifold M forms a group.


6.1.9. Two metrics $g$ and $\tilde g$ on a smooth manifold M are called conformal if there exists
a smooth function $f \in C^\infty(M,\mathbb{R})$ such that $\tilde g = fg$. Prove that for all $p \in M$ and
for all $X,Y \in T_pM$, the angle between X and Y with respect to $g$ is the same as
the angle with respect to $\tilde g$.


6.1.10. Let $(M,g)$ and $(N,\tilde g)$ be Riemannian manifolds. A diffeomorphism $f : M \to N$
is called a conformal mapping if $f^*\tilde g$ is conformal to $g$. Repeat Problem 6.1.8 but
replacing “isometry” with “conformal mapping.” (See Problem 6.1.9.)


6.1.11. Let $\gamma$ be a curve on a Riemannian manifold (M,g). Show precisely how the
induced metric on $\gamma$ generalizes Definition 6.1.10.


6.1.12. Poincaré ball. The Poincaré Ball is the open ball B® in n dimensions of radius R 
equipped with the metric 


ea ((dx’)? +--+ + (de")”) . 


Note that this metric is conformal with the metric (see Problem 6.1.9) induced 
from the Euclidean metric in R"+?. 


(a) Set n = 2 and R=1. (This choice of parameters is called the unit Poincaré 
disk.) Calculate the area of the region R defined by ||x|| < 3 and0 <0 < 
m/2. 


(b) Set n = 3and R = 2. Calculate the length of the curve y(t) = (cost, sint, t/10) 
in the Poincaré ball for 0 < t < 47. 


6.1.13. Divergence Theorem. Let (M,g) be an oriented, compact, Riemannian manifold
with boundary. Given any vector field $X \in \mathfrak{X}(M)$ and any tensor field T of type
(p,q), with $q \geq 1$, we define the contraction of X with T, denoted $i_XT$, as the
tensor field of type (p, q − 1) that over any coordinate chart has components

$$X^l\,T^{i_1\cdots i_p}_{l\,j_2\cdots j_q}.$$

We define the divergence operator $\operatorname{div} : \mathfrak{X}(M) \to C^\infty(M)$ implicitly by

$$d(i_X\,dV) = (\operatorname{div} X)\,dV.$$

Prove the Divergence Theorem, which states that for any $X \in \mathfrak{X}(M)$,

$$\int_M \operatorname{div} X\;dV = \int_{\partial M} g(X,N)\;d\tilde V,$$

where N is the outward unit normal to $\partial M$ and $d\tilde V$ is the volume form associated
to the metric on $\partial M$ induced from M.


6.1.14. We consider the sphere $S^3_R$ of radius R as a submanifold in $\mathbb{R}^4$ with the induced
Euclidean metric.

(a) Show that

$$F(u^1,u^2,u^3) = (R\cos u^1\sin u^2\sin u^3,\ R\sin u^1\sin u^2\sin u^3,\ R\cos u^2\sin u^3,\ R\cos u^3),$$

where $(u^1,u^2,u^3) \in [0,2\pi]\times[0,\pi]^2$, gives a parametrization for $S^3_R$ that is
homeomorphic to its image when restricted to the open set $V = (0,2\pi)\times(0,\pi)^2$.

(b) Calculate the components of the metric tensor on the coordinate patch
$F(V) = U$.

(c) Use part (b) to calculate the volume of a 3-sphere of radius R.

(d) Leaving R unspecified, consider the function $f(x^1,x^2,x^3,x^4) = (x^1)^2 + (x^2)^2 + (x^3)^2$
and calculate the volume integral $\int_{S^3_R} f\,dV$. (Note that this
integral would give the radius of gyration of the spherical shell of radius R
about a principal axis, if such a thing existed in $\mathbb{R}^4$!)


6.1.15. Calculate the 5-volume of the 5-sphere $S^5_R$ of radius R as a submanifold of $\mathbb{R}^6$.


6.1.16. (ODE) A loxodrome on the unit sphere $S^2$ is a curve that makes a constant angle
with all meridian lines. We propose to study analogues of loxodromes on $S^3$.
Consider the unit 3-sphere $S^3$ with the parametrization from Problem 6.1.14. Set
R = 1. We will call a loxodrome on $S^3$ any curve $\gamma$ such that $\gamma'$ makes a constant


angle of a2 with ae: and a constant angle of a3 with —~ 


Ou? Ou2 
(a) Find equations that the components of 7 must satisfy. 


(b) Solve the differential equations we get in part (a). [Hint: Obtain u’ and u? 
as functions of u?. You might only be able to obtain one of these functions 
implicitly.] 


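6.1.17. Consider the function $r : \mathbb{R}^{n+1} - \{0\} \to S^n$ given by $r(x) = x/\|x\|$.

(a) Using Example 6.1.14, prove that

$$\tilde\omega = r^*(dV_{S^n}) = \frac{1}{\|x\|^{n+1}}\,\omega.$$

(b) Show that $\tilde\omega$ is closed but not exact in $\mathbb{R}^{n+1} - \{0\}$.

(c) Use $dV_{S^n}$ to show that

$$\operatorname{Vol}(S^n) = (n+1)\operatorname{Vol}(B^{n+1}),$$

where $B^{n+1}$ is the unit ball in $\mathbb{R}^{n+1}$.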


6.1.18. Let M be a Riemannian manifold, and let $f : M \to M$ be an isometry on M.
Prove that $f^*(dV_M) = \pm dV_M$. (The isometry f is called orientation-preserving if
$f^*(dV_M) = dV_M$ and orientation-reversing if $f^*(dV_M) = -dV_M$.)

6.1.19. Suppose that J and K are disjoint, compact, oriented, connected, smooth submanifolds
of $\mathbb{R}^{n+1}$ whose dimensions are greater than 0 and such that $\dim J + \dim K = n$.
Define the function $\Psi$ by

$$\Psi : J\times K \to S^n, \qquad (x,y) \mapsto \frac{y-x}{\|y-x\|}.$$

The linking number between J and K is defined as

$$\operatorname{link}(J,K) = \frac{1}{\operatorname{Vol}(S^n)}\int_{J\times K}\Psi^*(dV_{S^n}).$$


Prove Gauss’s Linking Formula for the linking number of two closed space curves: 


= 


link(C1, C2) = eh uy ae aT NO) aud, (6.13) 


[Note that a closed curve is homeomorphic to a circle $S^1$, so $J\times K$ is homeomorphic
to a torus. Hence, we can view $\Psi$ as a function $T^2 = S^1\times S^1 \to S^2$.]


The following exercises involve the Hodge star operation, which is introduced in Appendix C.3. 


6.1.20. Let (M,g) be an oriented Riemannian manifold. Section C.3 defines the Hodge
star operator on inner product spaces. Given a form $\eta \in \Omega^k(M)$, at each $p \in M$,
the Hodge star operator defines an isomorphism $\star : \bigwedge^k T_pM^* \to \bigwedge^{n-k} T_pM^*$.

(a) Show that the Hodge star operator $\star$ is a vector bundle map $\bigwedge^k TM^* \to \bigwedge^{n-k} TM^*$
that leaves every base point fixed and that varies smoothly.

(b) Show that for all functions $f \in C^\infty(M)$, the Hodge star operator is given by
$\star f = f\,dV_g$.

6.1.21. Consider $\mathbb{R}^n$ as a manifold with the standard Euclidean metric.

(a) Calculate $\star dx^i$ for any $i = 1,\ldots,n$.

(b) Set n = 4, and calculate $\star(dx^i\wedge dx^j)$.

6.1.22. Let (M,g) be an oriented Riemannian manifold. Prove the following identities for
any vector field $X \in \mathfrak{X}(M)$.

(a) $\operatorname{div} X = \star d\star X^\flat$, where div is the divergence operator defined in Problem
6.1.13.

(b) $i_X\,dV_g = \star X^\flat$.

6.1.23. Let (M,g) be a Riemannian manifold. Consider the operation that consists of
$\star d\star d$.

(a) Show that $\star d\star d$ is an $\mathbb{R}$-linear operator $\Omega^k(M) \to \Omega^k(M)$ for $k < \dim M$.

(b) Let $M = \mathbb{R}^n$ be a standard Euclidean space. Recalling that $\Omega^0(M) = C^\infty(M)$,
show that for any smooth function f,

$$\star d\star df = \nabla^2 f,$$

where $\nabla^2$ is the usual Laplacian $\nabla^2 = \dfrac{\partial^2}{\partial(x^1)^2} + \cdots + \dfrac{\partial^2}{\partial(x^n)^2}$.

(c) Find an expression in coordinates of $\star d\star df$ for smooth functions $f : S^2 \to \mathbb{R}$,
expressed in (u,v) longitude-latitude coordinates as used in Example 6.1.14.


6.2 Connections and Covariant Differentiation 
6.2.1 Motivation 


Despite all the “heavy machinery” we have developed in order to create a theory of 
analysis on manifolds, we are still unable to calculate or even define certain things 
that are simple in R”. 

For example, if $\gamma : [a,b] \to M$ is a smooth curve on a smooth manifold, we have
no way at present to talk about the acceleration of $\gamma$. Let $p \in M$, with $\gamma(t_0) = p$.
In Definition 3.3.1, we presented the tangent vector $\gamma'(t_0)$ at p as the operator
$D_\gamma : C^1(M) \to \mathbb{R}$ that evaluates

$$D_\gamma(f) = \frac{d}{dt}f(\gamma(t))\Big|_{t_0}.$$

In Section 3.3, we developed the linear algebra of expressions $D_\gamma$ for curves $\gamma$
through p. The vector space of such operators is what we called the tangent space
$T_pM$.

Mimicking what one does in standard calculus, one could try to define the
acceleration vector $\gamma''(t_0)$ at p as a limiting ratio as $t \to t_0$ of $\gamma'(t) - \gamma'(t_0)$ with $t - t_0$.
However, what we just wrote does not make sense in the context of manifolds
because $\gamma'(t)$ and $\gamma'(t_0)$ are not even in the same vector spaces and so their difference
is not defined.

Another attempt to define the acceleration might follow Definition 3.3.1 and try
to define $\gamma''(t_0)$ at p as the operator $D^{(2)}_\gamma : C^2(M) \to \mathbb{R}$, where

$$D^{(2)}_\gamma(f) = \frac{d^2}{dt^2}f(\gamma(t))\Big|_{t_0}.$$

This operator is well defined and linear. However, $D^{(2)}_\gamma$ does not satisfy Leibniz's
rule, and therefore, there does not exist another curve $\tilde\gamma$ with $\tilde\gamma(t_0) = p$ such that
$D^{(2)}_\gamma = D_{\tilde\gamma}$. Hence, $D^{(2)}_\gamma \notin T_pM$. We could study properties for operators of the
form $D^{(2)}_\gamma$ but, since the operators do not exist in any $TM^{\otimes p}\otimes TM^{*\otimes q}$, this is not
the direction the theory of manifolds developed.

Another lack in our current theory is the ability to take partial derivatives or,
more generally, directional derivatives of a vector field. Over $\mathbb{R}^n$, it is easy to define
$\partial F/\partial x^j$, where F is a vector field, and, under suitable differentiability conditions,
$\partial F/\partial x^j$ is again another vector field. In contrast, if X is a vector field over a smooth
manifold M and U is a coordinate neighborhood of $p \in M$, one encounters the same
problem with defining a vector $\partial_jX$, as one does in defining the acceleration of a
curve.
A more subtle attempt to define partial derivatives of a vector field X on M in a
coordinate chart would be to imitate the exterior differential of forms (see Definition
5.4.4) and set as a differential for X the quantity

$$(dX^i)\otimes\partial_i = \frac{\partial X^i}{\partial x^j}\,dx^j\otimes\frac{\partial}{\partial x^i}.$$

However, this does not define a tensor field of type (1,1) on M. It is easiest to see
this by showing how the components violate the transformational properties of a
tensor field. Let $\bar x = (\bar x^1,\ldots,\bar x^n)$ be another system of coordinates that overlaps
with the coordinate patch for $x = (x^1,\ldots,x^n)$. Call $\bar X^j$ the components of the vector
field X in the $\bar x$ system. We know that

$$\bar X^j = \frac{\partial\bar x^j}{\partial x^i}\,X^i.$$

Taking a derivative with respect to $\bar x^k$ and inserting appropriate chain rules, we
have

$$\frac{\partial\bar X^j}{\partial\bar x^k}
= \frac{\partial}{\partial\bar x^k}\left(\frac{\partial\bar x^j}{\partial x^i}\right)X^i + \frac{\partial\bar x^j}{\partial x^i}\frac{\partial X^i}{\partial\bar x^k}
= \frac{\partial x^l}{\partial\bar x^k}\frac{\partial^2\bar x^j}{\partial x^l\,\partial x^i}\,X^i + \frac{\partial x^l}{\partial\bar x^k}\frac{\partial\bar x^j}{\partial x^i}\frac{\partial X^i}{\partial x^l}. \tag{6.14}$$

The presence of the first term on the right-hand side shows that the collection of
functions $\partial_jX^i$ does not satisfy the coordinate change properties of tensors given in
(5.8).


6.2.2 Connection on a Vector Bundle


To solve the above conceptual problems, we need some coordinate-invariant way to 
compare vectors in tangent spaces at nearby points. This is the role of a connection. 
A connection on a smooth manifold is an additional structure that, though we 
introduce it in this chapter, is entirely independent of any Riemannian structure. 
We can in fact define a connection on any vector bundle over M. Since we require 
this generality for our applications, we introduce connections in this manner. 


Definition 6.2.1. Let M be a smooth manifold, and let $\xi$ be a vector bundle
over M. Let $\mathcal{E}(\xi)$ denote the subspace of $\Gamma(\xi)$ of smooth global sections of $\xi$. A
connection on $\xi$ is a map

$$\nabla : \mathfrak{X}(M)\times\mathcal{E}(\xi) \to \mathcal{E}(\xi),$$

written $\nabla_XY$ instead of $\nabla(X,Y)$, that satisfies the following:

1. For all vector fields $Y \in \mathcal{E}(\xi)$, $\nabla(\_,Y)$ is linear over $C^\infty(M)$, i.e., for all
$f,g \in C^\infty(M)$,
$$\nabla_{fX+g\tilde X}Y = f\,\nabla_XY + g\,\nabla_{\tilde X}Y.$$

2. For all vector fields $X \in \mathfrak{X}(M)$, $\nabla(X,\_)$ is linear over $\mathbb{R}$, i.e., for all $a,b \in \mathbb{R}$,
$$\nabla_X(aY + b\tilde Y) = a\,\nabla_XY + b\,\nabla_X\tilde Y.$$

3. For all vector fields $X \in \mathfrak{X}(M)$, $\nabla(X,\_)$ satisfies the product rule
$$\nabla_X(fY) = (Xf)\,Y + f\,\nabla_XY$$
for all $f \in C^\infty(M)$.

The vector field $\nabla_XY$ in $\mathcal{E}(\xi)$ is called the covariant derivative of Y in the direction
of X.


The symbol $\nabla$ is pronounced “del.” The defining properties of the covariant
derivative are modeled after the properties of directional derivatives of vector fields
on $\mathbb{R}^n$ (see Problem 6.2.1). Intuitively, the connection explicitly defines how to take
a partial derivative in $\mathcal{E}(\xi)$ with respect to vector fields in TM. In fact, Problems
6.2.3 and 6.2.4 show that $\nabla_XY|_p$ depends only on the value $X_p$ in $T_pM$ and the
values of Y in a neighborhood of p on M. Therefore $\nabla_XY|_p$ is truly a directional
derivative of Y at p in the direction $X_p$. Hence, we often write $\nabla_{X_p}Y$ instead of
$\nabla_XY|_p$.

For the applications in differential geometry, we will usually be interested in
using connections on vector bundles of the form $\xi = TM^{\otimes r}\otimes TM^{*\otimes s}$. As it will turn
out, connections on these vector bundles are closely related to possible connections
on TM. Therefore, we temporarily restrict our attention to connections

$$\nabla : \mathfrak{X}(M)\times\mathfrak{X}(M) \to \mathfrak{X}(M).$$

Over a coordinate patch U of M, the defining properties are such that $\nabla$ is completely
determined once one knows its values for $X = \partial_i$ and $Y = \partial_j$. Since $\nabla_{\partial_i}\partial_j$
is another vector field on U, we write

$$\nabla_{\partial_i}\partial_j = \Gamma^k_{ij}\,\partial_k. \tag{6.15}$$

The components $\Gamma^k_{ij}$ are smooth functions $U \to \mathbb{R}$.


Definition 6.2.2. The functions $\Gamma^k_{ij}$ in Equation (6.15) are called the Christoffel
symbols of the connection $\nabla$.


As it turns out, there are no restrictions besides smoothness on the functions
$\Gamma^k_{ij}$.


Proposition 6.2.3. Let $M^n$ be a smooth manifold, and let U be a coordinate patch
on M. There is a bijective correspondence between connections on $\mathfrak{X}(U)$ and
collections of $n^3$ smooth functions $\Gamma^k_{ij}$ defined on U. The bijection is given by the
formula

$$\nabla_XY = \big(X^i\,\partial_iY^k + \Gamma^k_{ij}X^iY^j\big)\,\partial_k. \tag{6.16}$$

Proof. First, suppose that $\nabla$ is a connection on $\mathfrak{X}(U)$ and let $X,Y \in \mathfrak{X}(U)$. Then
by the relations in Definition 6.2.1,

$$\nabla_XY = \nabla_{X^i\partial_i}(Y^j\partial_j) = X^i\,\nabla_{\partial_i}(Y^j\partial_j)
= X^i(\partial_iY^j)\,\partial_j + X^iY^j\,\nabla_{\partial_i}\partial_j
= X^i(\partial_iY^j)\,\partial_j + X^iY^j\,\Gamma^k_{ij}\,\partial_k.$$

Equation (6.16) holds by changing the variable of summation from j to k in the
first term of the last expression. Conversely, if $\Gamma^k_{ij}$ are any smooth functions on U
and if we define an operator $\mathfrak{X}(U)\times\mathfrak{X}(U) \to \mathfrak{X}(U)$ by Equation (6.16), it is quick to
check that the three criteria of Definition 6.2.1 hold. Thus, Equation (6.16) defines
a connection on $\mathfrak{X}(U)$.
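The converse direction of the proof can be explored numerically. The Python sketch below evaluates the component formula $X^i\,\partial_iY^k + \Gamma^k_{ij}X^iY^j$ at a point of $\mathbb{R}^2$, with partial derivatives approximated by central differences, and compares both sides of the product rule in Definition 6.2.1. The Christoffel functions `gamma`, the fields `X` and `Y`, the function `f`, and the sample point are all hypothetical choices made only for illustration.

```python
import math

def partial(f, x, i, h=1e-5):
    """Central-difference approximation of the i-th partial derivative of f at x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f(xp) - f(xm)) / (2 * h)

def nabla(gamma, X, Y, x):
    """Components of nabla_X Y at x: X^i d_i Y^k + Gamma^k_ij X^i Y^j.

    gamma(x)[k][i][j] plays the role of Gamma^k_ij at the point x.
    """
    n = len(x)
    Xx, Yx, G = X(x), Y(x), gamma(x)
    out = []
    for k in range(n):
        deriv = sum(Xx[i] * partial(lambda p: Y(p)[k], x, i) for i in range(n))
        conn = sum(G[k][i][j] * Xx[i] * Yx[j] for i in range(n) for j in range(n))
        out.append(deriv + conn)
    return out

# Hypothetical smooth data on R^2, chosen only for illustration
def gamma(x):
    return [[[x[0], 1.0], [0.0, x[1]]],
            [[math.sin(x[0]), 0.0], [x[0] * x[1], 2.0]]]

X = lambda x: [x[1], 1.0]
Y = lambda x: [x[0] ** 2, x[0] * x[1]]
f = lambda x: x[0] + x[1] ** 2
fY = lambda x: [f(x) * c for c in Y(x)]

p = [0.3, -0.8]
lhs = nabla(gamma, X, fY, p)                                # nabla_X (fY)
Xf = sum(X(p)[i] * partial(f, p, i) for i in range(2))      # X(f) at p
nXY = nabla(gamma, X, Y, p)
rhs = [Xf * Y(p)[k] + f(p) * nXY[k] for k in range(2)]      # (Xf) Y + f nabla_X Y
print(lhs, rhs)
```

The two printed lists should agree up to finite-difference error, whatever smooth functions are used for `gamma`, which reflects the statement that the $\Gamma^k_{ij}$ are unconstrained apart from smoothness.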


At a first pass, the definition of a connection may seem rather burdensome 
and unintuitive. However, the component description given in (6.16) has the same 
format of something we have already seen. We encountered the same formula in 
(2.11) in the context of calculating partial derivatives of the components of a vector 
field expressed in reference to a variable frame in $\mathbb{R}^n$. In (2.11), the component
functions $\Gamma^k_{ij}$ precisely play the role described in (6.15). Consequently, in developing
the concept of connections on the tangent bundle to a manifold, we could have 
started from (6.15) and worked back to the properties listed in Definition 6.2.1. 
Definition 6.2.1 is therefore simply a coordinate-free description of (6.16), which 
arose from a relationship that first appears in the analysis of moving frames in R”. 

It is important to point out that the Lie derivative is not a connection because 
it violates the first criterion in Definition 6.2.1, namely the Lie derivative Lx Y is 
only R-linear in X as opposed to C'°(M)-linear. 
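
The failure of $C^\infty(M)$-linearity can be checked in one line. The following worked computation is a sketch that uses only the identity $\mathcal{L}_X Y = [X, Y]$ and the product rule for vector fields acting on functions; the test function $h$ is introduced here purely for illustration.

```latex
% For f, h in C^\infty(M) and vector fields X, Y:
\begin{aligned}
[fX, Y](h) &= f\,X(Y(h)) - Y\bigl(f\,X(h)\bigr) \\
           &= f\,X(Y(h)) - Y(f)\,X(h) - f\,Y(X(h)) \\
           &= \bigl(f\,[X, Y] - Y(f)\,X\bigr)(h),
\end{aligned}
\qquad\text{so}\qquad
\mathcal{L}_{fX} Y = f\,\mathcal{L}_X Y - Y(f)\,X .
```

The extra term $-Y(f)X$ vanishes for every $Y$ only when $f$ is locally constant, which is why $\mathcal{L}$ is $\mathbb{R}$-linear but not $C^\infty(M)$-linear in its first argument.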


Example 6.2.4 (The Flat Connection on $\mathbb{R}^n$). In $\mathbb{R}^n$, the vector fields $\partial_i$ are
constant, and we identify them with the standard basis vectors $\vec{e}_i$. According to
Proposition 6.2.3, a connection exists for any collection of $n^3$ functions. However,
if $Y = Y^j \partial_j$ is a vector field in $\mathbb{R}^n$, our usual way of taking partial derivatives of
vector fields is

\[ \nabla_{\partial_i} Y = \frac{\partial Y^j}{\partial x^i}\, \partial_j, \]

which takes partial derivatives componentwise on $Y$. By (6.16), we see that $\Gamma^k_{ij} = 0$
for all choices of the indices. A connection with this property is called a flat
connection over the coordinate patch.


Even though the symbols $\Gamma^i_{jk}$ resemble our notation for the components of a
(1, 2)-tensor, a connection is not a tensor field. The reason derives from the fact
that $\partial_i X^j$ is not a (1,1)-tensor field. In fact, from (6.14) and the transformational
properties of a vector field between overlapping coordinate systems on $M$, we can
deduce the transformational properties of the component functions of a connection.


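Proposition 6.2.5. Let $\nabla$ be a connection on $\mathfrak{X}(M)$. Suppose that $U$ and $\bar{U}$ are
overlapping coordinate patches, and denote by $\Gamma^i_{jk}$ and $\bar{\Gamma}^l_{mn}$ the component functions
of $\nabla$ over these patches, respectively. Then over $U \cap \bar{U}$, the component functions
are related to each other by

\[ \bar{\Gamma}^l_{mn} = \frac{\partial x^j}{\partial \bar{x}^m} \frac{\partial x^k}{\partial \bar{x}^n} \frac{\partial \bar{x}^l}{\partial x^i}\, \Gamma^i_{jk} - \frac{\partial x^j}{\partial \bar{x}^m} \frac{\partial x^k}{\partial \bar{x}^n} \frac{\partial^2 \bar{x}^l}{\partial x^j \partial x^k}. \]


Proof. (Left as an exercise for the reader.)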


The astute reader might have noticed already from Definition 6.2.1 that a
connection is not a tensor field of type (1,2). If an operator $F : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$
were a tensor field in $TM \otimes TM^{*\otimes 2}$, then $F(X, \_)$ would be $C^\infty(M)$-linear and
would not satisfy the third property in Definition 6.2.1.


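Example 6.2.6 (Polar Coordinates). We consider the connection $\nabla$ on $\mathbb{R}^2$ that
is flat over the Cartesian coordinate system. We calculate the components of $\nabla$
with respect to polar coordinates. We could calculate the Christoffel symbols from
Proposition 6.2.3, but instead, we use Proposition 6.2.5. Set $x^1 = x$, $x^2 = y$, $\bar{x}^1 = r$,
and $\bar{x}^2 = \theta$, and denote by $\Gamma^i_{jk} = 0$ the Christoffel symbols for the flat connection
on $\mathbb{R}^2$ and let $\bar{\Gamma}^l_{mn}$ denote the Christoffel symbols for $\nabla$ in polar coordinates.

By direct calculation,

\[ \begin{aligned}
\bar{\Gamma}^2_{12} &= -\frac{\partial x^j}{\partial \bar{x}^1} \frac{\partial x^k}{\partial \bar{x}^2} \frac{\partial^2 \bar{x}^2}{\partial x^j \partial x^k} \\
&= -\left( \frac{\partial x}{\partial r}\frac{\partial x}{\partial \theta}\frac{\partial^2 \theta}{\partial x^2} + \frac{\partial x}{\partial r}\frac{\partial y}{\partial \theta}\frac{\partial^2 \theta}{\partial x \partial y} + \frac{\partial y}{\partial r}\frac{\partial x}{\partial \theta}\frac{\partial^2 \theta}{\partial y \partial x} + \frac{\partial y}{\partial r}\frac{\partial y}{\partial \theta}\frac{\partial^2 \theta}{\partial y^2} \right) \\
&= -\left( -r \sin\theta \cos\theta\, \frac{2xy}{(x^2+y^2)^2} + r\cos^2\theta\, \frac{y^2 - x^2}{(x^2+y^2)^2} - r \sin^2\theta\, \frac{y^2 - x^2}{(x^2+y^2)^2} + r \sin\theta \cos\theta\, \frac{-2xy}{(x^2+y^2)^2} \right) \\
&= \frac{1}{r}\left( 2\sin^2\theta \cos^2\theta + (\cos^2\theta - \sin^2\theta)^2 + 2 \sin^2\theta \cos^2\theta \right) = \frac{1}{r}.
\end{aligned} \]

It is not hard (though perhaps a little tedious) to show that

\[ \bar{\Gamma}^1_{11} = 0, \quad \bar{\Gamma}^1_{12} = \bar{\Gamma}^1_{21} = 0, \quad \bar{\Gamma}^1_{22} = -r, \qquad
\bar{\Gamma}^2_{11} = 0, \quad \bar{\Gamma}^2_{12} = \bar{\Gamma}^2_{21} = \frac{1}{r}, \quad \bar{\Gamma}^2_{22} = 0. \]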
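
The remaining symbols of Example 6.2.6 can also be checked numerically. The sketch below evaluates the transformation law of Proposition 6.2.5 with all Cartesian symbols equal to zero, so only the second-derivative term survives. The function name `polar_christoffel` and the 0-based index convention (0 for $r$, 1 for $\theta$) are choices made here for illustration, not notation from the text.

```python
import math

def polar_christoffel(r, theta):
    """Return gamma[l][m][n] = Gamma-bar^l_{mn} in polar coordinates.

    Index 0 stands for r and index 1 for theta. The Cartesian symbols are
    zero (flat connection), so the transformation law reduces to
    Gamma-bar^l_{mn} = -(dx^j/dxbar^m)(dx^k/dxbar^n)(d^2 xbar^l / dx^j dx^k).
    """
    x, y = r * math.cos(theta), r * math.sin(theta)
    # jac[j][m] = d x^j / d xbar^m, with x^j in (x, y) and xbar^m in (r, theta)
    jac = [[math.cos(theta), -r * math.sin(theta)],
           [math.sin(theta), r * math.cos(theta)]]
    r2 = x * x + y * y
    # hess[l][j][k] = second partials of xbar^l with respect to x^j, x^k
    hess = [
        [[y * y / r2 ** 1.5, -x * y / r2 ** 1.5],
         [-x * y / r2 ** 1.5, x * x / r2 ** 1.5]],        # Hessian of r
        [[2 * x * y / r2 ** 2, (y * y - x * x) / r2 ** 2],
         [(y * y - x * x) / r2 ** 2, -2 * x * y / r2 ** 2]],  # Hessian of theta
    ]
    return [[[-sum(jac[j][m] * jac[k][n] * hess[l][j][k]
                   for j in range(2) for k in range(2))
              for n in range(2)] for m in range(2)] for l in range(2)]

gamma = polar_christoffel(2.0, 0.7)
print(gamma[0][1][1], gamma[1][0][1])
```

At $r = 2$ the two printed values agree, up to rounding, with $\bar{\Gamma}^1_{22} = -r = -2$ and $\bar{\Gamma}^2_{12} = 1/r = 0.5$, and the other entries vanish to machine precision.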


We now wish to extend our discussion of connections on $TM$ to connections on
any tensor bundle $TM^{\otimes r} \otimes TM^{*\otimes s}$ in a natural manner for any pair $(r,s)$. Two
situations are settled: (1) if $f \in TM^{\otimes 0} = C^\infty(M)$, then we want $\nabla_X f = X(f)$, the
expected directional derivative; and (2) if $X \in \mathfrak{X}(M)$, then the connection should
follow the properties described in Definition 6.2.1 and Proposition 6.2.3.


Lemma 6.2.7. Let $M$ be a smooth manifold, and let $\nabla$ be a connection on $TM$.
For each pair $(r,s) \in \mathbb{N}^2$, there exists a unique connection on the tensor bundle
$TM^{\otimes r} \otimes TM^{*\otimes s}$, also denoted $\nabla$, given by the following conditions for any vector
field $X$:


1. Consistency: $\nabla$ is equal to the connection given on $TM$.
2. Directional derivative: $\nabla_X f = X(f)$ for all $f \in C^\infty(M) = TM^{\otimes 0}$.
3. Contraction product rule: for all covector fields $\omega$ and vector fields $Y$,
$\nabla_X(\omega(Y)) = (\nabla_X \omega)(Y) + \omega(\nabla_X Y)$.
4. Tensor product rule: for all tensor fields $A$ and $B$ of any type,
$\nabla_X(A \otimes B) = (\nabla_X A) \otimes B + A \otimes (\nabla_X B)$.


We omit the proof of this lemma since it is merely constructive. Property 3
determines uniquely how to define $\nabla_X \omega$ for any covector field $\omega$, and then Property 4
extends the connection to all other types of tensors.
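
To see how Property 3 pins down the connection on covector fields, the following worked computation may help. It is a sketch that assumes the component convention $\nabla_{\partial_i} \partial_j = \Gamma^k_{ij} \partial_k$ of Proposition 6.2.3 and writes $\omega = \omega_j\, dx^j$.

```latex
% Solve Property 3 for (\nabla_X \omega)(Y), then take X = \partial_i, Y = \partial_j:
\begin{aligned}
(\nabla_X \omega)(Y) &= X\bigl(\omega(Y)\bigr) - \omega(\nabla_X Y), \\
(\nabla_{\partial_i} \omega)(\partial_j)
  &= \partial_i \omega_j - \omega\bigl(\Gamma^k_{ij}\,\partial_k\bigr)
   = \partial_i \omega_j - \Gamma^k_{ij}\,\omega_k .
\end{aligned}
```

The Christoffel symbols therefore enter with a minus sign for a covariant index, in contrast with the plus sign for the contravariant index in (6.16).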


Definition 6.2.8. Let $M$ be a smooth manifold. We call $\nabla$ an affine connection
on $TM^{\otimes r} \otimes TM^{*\otimes s}$ if it satisfies the conditions of Lemma 6.2.7.


6.2.3 Covariant Derivative


Let $\nabla$ be an affine connection on a smooth manifold $M$. Let $F$ be a tensor field
of type $(r,s)$. Then the mapping $\nabla F$ that maps a vector field $X$ to $\nabla_X F$ is a
$C^\infty(M)$-linear transformation from $\mathfrak{X}(M)$ to the space of tensor fields of type $(r,s)$.
Thus, for each $p \in M$, $\nabla F|_p$ is a linear transformation $T_pM \to T_pM^{\otimes r} \otimes T_pM^{*\otimes s}$,
so by Proposition 4.4.10,

\[ \nabla F|_p \in \operatorname{Hom}\left(T_pM,\; T_pM^{\otimes r} \otimes T_pM^{*\otimes s}\right) = T_pM^{\otimes r} \otimes T_pM^{*\otimes (s+1)}. \]

Furthermore, since $\nabla F|_p$ varies smoothly with $p$, $\nabla F$ is a smooth section of the
tensor bundle $TM^{\otimes r} \otimes TM^{*\otimes (s+1)}$, and hence it is a tensor field of type $(r, s+1)$.


Definition 6.2.9. Let $M$ be a smooth manifold equipped with an affine connection
$\nabla$. If $F$ is a tensor field of type $(r,s)$, then the tensor field $\nabla F$ of type $(r, s+1)$ is
called the covariant derivative of $F$.


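Proposition 6.2.10. Let $F$ be a tensor field of type $(r,s)$ over a manifold $M$.
Suppose that $F$ has components $F^{i_1 \cdots i_r}_{j_1 \cdots j_s}$ over a coordinate chart $U$. Then the components
of the covariant derivative $\nabla F$ are

\[ F^{i_1 \cdots i_r}_{j_1 \cdots j_s ; k} \overset{\text{def}}{=} \frac{\partial F^{i_1 \cdots i_r}_{j_1 \cdots j_s}}{\partial x^k}
+ \sum_{\alpha=1}^{r} \Gamma^{i_\alpha}_{k\mu}\, F^{i_1 \cdots i_{\alpha-1}\, \mu\, i_{\alpha+1} \cdots i_r}_{j_1 \cdots j_s}
- \sum_{\beta=1}^{s} \Gamma^{\mu}_{k j_\beta}\, F^{i_1 \cdots i_r}_{j_1 \cdots j_{\beta-1}\, \mu\, j_{\beta+1} \cdots j_s}. \tag{6.17} \]

(Some authors use other notation for the components of the covariant
derivative.) The notation in Equation (6.17) is a little heavy, but it should
become clear with a few examples. If $\omega$ is a 1-form, then $\nabla\omega$ is a (0, 2)-tensor field with local
components given by

\[ \nabla\omega = \omega_{j;k}\, dx^j \otimes dx^k, \quad\text{where}\quad \omega_{j;k} = \partial_k \omega_j - \Gamma^i_{kj}\, \omega_i. \]

Similarly, if $A^{ij}_k$ are the components of a (2, 1)-tensor field $A$, then $\nabla A$ is a (2, 2)-tensor
field with local components given by

\[ \nabla A = A^{ij}_{k;l}\, \partial_i \otimes \partial_j \otimes dx^k \otimes dx^l, \quad\text{where}\quad
A^{ij}_{k;l} = \frac{\partial A^{ij}_k}{\partial x^l} + \Gamma^i_{l\mu} A^{\mu j}_k + \Gamma^j_{l\mu} A^{i\mu}_k - \Gamma^\mu_{lk} A^{ij}_\mu. \]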


6.2.4 Levi-Civita Connection 


Proposition 6.2.3 gives considerable freedom in choosing the components of a con- 
nection. In the context of Riemannian geometry, it is natural to wish for a con- 
nection that is in some sense “nice” with respect to the metric on the manifold. 
The following theorem is motivated by results in classical differential geometry of 
surfaces discussed in [5, Section 7.2] but is so central to Riemannian geometry that 
it is sometimes called the “miracle” of Riemannian geometry [45]. 


Theorem 6.2.11 (Levi-Civita Theorem). Let $(M,g)$ be a Riemannian manifold.
There exists a unique affine connection $\nabla$ that satisfies the following two conditions:


1. Compatibility: $\nabla g$ is identically 0.
2. Symmetry: for all $X, Y \in \mathfrak{X}(M)$, $[X,Y] = \nabla_X Y - \nabla_Y X$.


A few comments are in order before we prove this theorem. The condition that
$\nabla g = 0$ intuitively says that $\nabla$ is flat with respect to the metric. We say that $\nabla$
is compatible with the metric. We leave it as an exercise for the reader (Problem
6.2.12) to show that if we write $g = \langle\, ,\, \rangle$, then $\nabla g$ is identically 0 (i.e., $g_{ij;k} = 0$ in
local coordinates) if and only if

\[ \nabla_X(\langle Y, Z\rangle) = \langle \nabla_X Y, Z\rangle + \langle Y, \nabla_X Z\rangle. \tag{6.18} \]

Hence, if $\nabla$ is compatible with the metric $g$, then it satisfies a product rule with
respect to the metric.

By Problem 6.2.14, condition 2 implies that over any coordinate patch of the
manifold, the Christoffel symbols $\Gamma^i_{jk}$ of the connection $\nabla$ satisfy $\Gamma^i_{jk} = \Gamma^i_{kj}$, which
justifies the terminology of a symmetric connection.


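Definition 6.2.12. The connection $\nabla$ described in Theorem 6.2.11 is called the
Levi-Civita connection or the Riemannian connection with respect to the metric $g$
on $M$.

Proof of Theorem 6.2.11. Let $X, Y, Z \in \mathfrak{X}(M)$, and denote $g = \langle\, ,\, \rangle$. Since $\langle X, Y\rangle$
is a smooth function on $M$, we write $\nabla_Z(\langle X, Y\rangle) = Z\langle X, Y\rangle$.
Now suppose that such a connection $\nabla$ exists. Then

\[ X\langle Y, Z\rangle = \langle \nabla_X Y, Z\rangle + \langle Y, \nabla_X Z\rangle, \tag{6.19} \]
\[ Y\langle Z, X\rangle = \langle \nabla_Y Z, X\rangle + \langle Z, \nabla_Y X\rangle, \tag{6.20} \]
\[ Z\langle X, Y\rangle = \langle \nabla_Z X, Y\rangle + \langle X, \nabla_Z Y\rangle. \tag{6.21} \]

Adding Equations (6.19) and (6.20) and subtracting Equation (6.21), using the
symmetry of the metric, we get

\[ \begin{aligned}
X\langle Y, Z\rangle &+ Y\langle Z, X\rangle - Z\langle X, Y\rangle \\
&= \langle \nabla_X Y - \nabla_Y X, Z\rangle + \langle \nabla_X Z - \nabla_Z X, Y\rangle + \langle \nabla_Y Z - \nabla_Z Y, X\rangle + 2\langle Z, \nabla_Y X\rangle.
\end{aligned} \]

Using the fact that $\nabla$ is symmetric, we have

\[ X\langle Y, Z\rangle + Y\langle Z, X\rangle - Z\langle X, Y\rangle
= \langle [X,Y], Z\rangle + \langle [X,Z], Y\rangle + \langle [Y,Z], X\rangle + 2\langle Z, \nabla_Y X\rangle. \]

Since symmetry also gives $\langle Z, \nabla_Y X\rangle = \langle Z, \nabla_X Y\rangle - \langle Z, [X,Y]\rangle$, we obtain

\[ \langle Z, \nabla_X Y\rangle = \frac{1}{2}\Bigl( X\langle Y, Z\rangle + Y\langle Z, X\rangle - Z\langle X, Y\rangle
+ \langle [X,Y], Z\rangle - \langle [X,Z], Y\rangle - \langle [Y,Z], X\rangle \Bigr). \tag{6.22} \]

Now a connection on any coordinate patch is uniquely determined by its Christoffel
symbols. Setting $X = \partial_i$, $Y = \partial_j$, and $Z = \partial_k$, (6.22) gives a method
to obtain the Christoffel symbols of $\nabla$ strictly in terms of the metric. Hence, if a
connection as described in the theorem exists, then it is unique.

To show that such a connection exists, simply start by defining $\nabla$ using the
identity in (6.22). Then it is not hard to show that the connection is both symmetric
and compatible with $g$.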


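Proposition 6.2.13. Let $(M^n, g)$ be a smooth Riemannian manifold. Then over a
coordinate patch of $M$ with coordinates $(x^1, \ldots, x^n)$, the Christoffel symbols of the
Levi-Civita connection are given by

\[ \Gamma^i_{jk} = \sum_{l=1}^{n} \frac{1}{2}\, g^{il} \left( \frac{\partial g_{lj}}{\partial x^k} + \frac{\partial g_{kl}}{\partial x^j} - \frac{\partial g_{jk}}{\partial x^l} \right), \tag{6.23} \]

where $g^{ij}$ are the entries of the inverse matrix of $(g_{kl})$.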


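Proof. Set $g = \langle\, ,\, \rangle$, and let $X = \partial_i$, $Y = \partial_j$, and $Z = \partial_k$. By the Levi-Civita
connection defined in (6.22), we have

\[ \langle \nabla_{\partial_i} \partial_j, \partial_k \rangle = \frac{1}{2}\Bigl( \partial_i \langle \partial_j, \partial_k\rangle + \partial_j \langle \partial_k, \partial_i\rangle - \partial_k \langle \partial_i, \partial_j\rangle
+ \langle [\partial_i, \partial_j], \partial_k\rangle - \langle [\partial_i, \partial_k], \partial_j\rangle - \langle [\partial_j, \partial_k], \partial_i\rangle \Bigr). \]

However, since mixed partial derivatives of smooth functions commute, $[\partial_i, \partial_j] = 0$ for any indices $i, j$.
Furthermore, by definition, $g_{ij} = \langle \partial_i, \partial_j\rangle$, so by the linearity of the metric on the
left-hand side,

\[ \sum_{l=1}^{n} g_{lk} \Gamma^l_{ij} = \frac{1}{2}\left( \partial_i g_{jk} + \partial_j g_{ki} - \partial_k g_{ij} \right). \]

The proposition follows by multiplying (and contracting) by $g^{kl}$, the components of
the inverse of $(g_{kl})$.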
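
Formula (6.23) translates directly into a short numerical routine. The sketch below is restricted to two-dimensional patches and approximates the partial derivatives of the metric by central differences; the names `christoffel`, `inverse_2x2`, and `polar_metric`, the step size `h`, and the 0-based indices are illustrative choices made here, not part of the text.

```python
def inverse_2x2(g):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def christoffel(metric, point, h=1e-5):
    """Return gamma[i][j][k] approximating Gamma^i_{jk} of (6.23) at `point`.

    `metric` maps a coordinate pair to the 2x2 matrix (g_ab); the partial
    derivatives of g are replaced by central differences with step h.
    """
    n = 2
    ginv = inverse_2x2(metric(point))
    # dg[l][a][b] approximates the derivative of g_ab with respect to x^l
    dg = []
    for l in range(n):
        plus, minus = list(point), list(point)
        plus[l] += h
        minus[l] -= h
        gp, gm = metric(plus), metric(minus)
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)]
                   for a in range(n)])
    return [[[0.5 * sum(ginv[i][l] * (dg[k][l][j] + dg[j][k][l] - dg[l][j][k])
                        for l in range(n))
              for k in range(n)] for j in range(n)] for i in range(n)]

def polar_metric(p):
    """Euclidean metric of the plane in polar coordinates (r, theta)."""
    r, _theta = p
    return [[1.0, 0.0], [0.0, r * r]]

gamma = christoffel(polar_metric, [2.0, 0.3])
print(gamma[0][1][1], gamma[1][0][1])
```

For the polar form of the Euclidean metric the output reproduces, up to discretization error, the values $-r$ and $1/r$ found in Example 6.2.6, which is consistent with the flat connection on $\mathbb{R}^2$ being the Levi-Civita connection of the Euclidean metric.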


The reader who is familiar with the differential geometry of surfaces has already
seen Proposition 6.2.13 but in a more limited context. In Section 7.2 of [5], the
authors talk about Gauss's equations for a regular surface over a parametrization
$\vec{X}$. In that section, one sees that even though the normal vector to a surface is
not an intrinsic property, $\vec{X}_{ij} \cdot \vec{X}_k$ is intrinsic and in fact is given by the Christoffel
symbols of the first kind; raising an index with the inverse metric gives precisely
the symbols in Equation (6.23), though with
$n = 2$. This is not a mere coincidence. In defining the Levi-Civita connection,
the requirement that $\nabla$ be compatible with $g$ made intuitive sense. However, the
stipulation that $\nabla$ be symmetric may have seemed somewhat
artificial at the time. It is very interesting that the two conditions in Theorem
6.2.11 lead to Christoffel symbols that match those defined for surfaces in classical
differential geometry.

It is possible to develop a theory of embedded submanifolds $M^m$ of $\mathbb{R}^n$ following
the theory of regular surfaces in $\mathbb{R}^3$. Mimicking the presentation in [5, Section 7.2],
if $\vec{X}$ is a parametrization of a coordinate patch of $M$, then, by setting

\[ \frac{\partial^2 \vec{X}}{\partial x^i \partial x^j} = \sum_{k=1}^{m} \Gamma^k_{ij}\, \frac{\partial \vec{X}}{\partial x^k} + (\text{Normal component}), \]

the components $\Gamma^k_{ij}$ are again the Christoffel symbols of the second kind, given
by the same formula in (6.23). This shows that for submanifolds of a Euclidean
space, the Levi-Civita connection on a Riemannian manifold is essentially the flat
connection on $\mathbb{R}^n$ restricted to the manifold.

One of the beauties of the condition that $\nabla g = 0$ is that the process of raising
and lowering indices commutes with taking the covariant derivative associated to the
Levi-Civita connection. In components, this means for example that if $A^i_j = g^{il} A_{lj}$,
then

\[ A^i_{j;k} = g^{il} A_{lj;k}. \]

This follows because in components $\nabla g = 0$ identically means that $g_{ij;k} = 0$ for the
Levi-Civita covariant derivative. Then since $g_{ij} g^{jl} = \delta_i^l$ is a numerical tensor, we
have

\[ 0 = \delta^l_{i;k} = g_{ij;k}\, g^{jl} + g_{ij}\, g^{jl}{}_{;k} = g_{ij}\, g^{jl}{}_{;k}, \]

which implies that $g^{jl}{}_{;k} = 0$ since $(g_{ij})$ is invertible. Thus, in our specific example,

\[ A^i_{j;k} = g^{il}{}_{;k}\, A_{lj} + g^{il} A_{lj;k} = g^{il} A_{lj;k}. \]

In this example we raised the index, but it is clear from the product rule and the
fact that $\nabla g = 0$ and $\nabla g^{-1} = 0$ that this property holds for all tensors.


6.2.5 Divergence Operator 


We finish this section with a comment on the divergence operator on tensors
introduced in Problem 6.1.13. We will show in Problem 6.2.16 that, using the Levi-Civita
connection, the divergence operator on a vector field $X \in \mathfrak{X}(M)$ can be written as

\[ \operatorname{div} X = X^i{}_{;i}. \tag{6.24} \]

This motivates, first, the definition of the divergence of any tensor $T$ of type $(r,s)$,
with $r \geq 1$, on a Riemannian manifold. If $T$ has components $T^{i_1 i_2 \cdots i_r}_{j_1 \cdots j_s}$ in a coordinate
system, then the divergence of $T$, written $\operatorname{div} T$ or $\nabla \cdot T$, is the tensor field of type
$(r-1, s)$ with component functions

\[ (\operatorname{div} T)^{i_2 \cdots i_r}_{j_1 \cdots j_s} = T^{\alpha\, i_2 \cdots i_r}_{j_1 \cdots j_s ; \alpha}. \]

Similarly, we can take the divergence with respect to any contravariant index but
we must specify which index. If the index is not specified, we assume the divergence
is taken with respect to the first index.

We can also define the divergence of a covariant index by raising that index first.
Thus, for example, if $\omega$ is a 1-form, then

\[ \operatorname{div} \omega = (g^{ij} \omega_j)_{;i}. \tag{6.25} \]

Problem 6.2.16 shows that whether one raises the index before or after the covariant
derivative is irrelevant.


PROBLEMS 


6.2.1. Consider the special case of the manifold M = R*. Let X be the constant vector 
field @, and let X(IR*) be the space of vector fields R® — R°. Show that the usual 
partial derivative Ds applied to X(R*) satisfies conditions 2 and 3 of Definition 
6.2.1. 


6.2.2. Recall the permutation symbol defined in (4.35). Let M be a three-dimensional 
manifold equipped with a symmetric affine connection. Let A and B be vector 
fields on M. Show that 


and that 7 7 - 
(c)* A; Br), = oF A By = eU* Ay Byui. 
If $M = \mathbb{R}^3$, explain how the latter formula is equivalent to
$\nabla \cdot (A \times B) = (\nabla \times A) \cdot B - A \cdot (\nabla \times B)$.
6.2.3. Let $\nabla$ be a connection on a vector bundle $\xi$ over a smooth manifold $M$. Prove
that if $X = \tilde{X}$ and $Y = \tilde{Y}$ over a neighborhood of $p$, then
$\nabla_X Y|_p = \nabla_{\tilde{X}} \tilde{Y}|_p$.

6.2.4. Let $\nabla$ be a connection on a vector bundle $\xi$ over a smooth manifold $M$. Use the
result of Problem 6.2.3 to show that $\nabla_X Y|_p$ depends only on $X_p$ and the values
of $Y$ in a neighborhood of $p$.

6.2.5. Prove Proposition 6.2.5.

6.2.6. Prove that the Levi-Civita connection for the Euclidean space $\mathbb{R}^n$ is such that
$\nabla_X Y = X(Y^k)\,\partial_k$.

6.2.7. Consider the open first quadrant $U = \{(u,v) \in \mathbb{R}^2 \mid u > 0,\ v > 0\}$, and equip $U$
with the metric

1 1 
V u2 v2 
(93) = 1 i” 
Wareer ue 

Calculate the Christoffel symbols for the associated Levi-Civita connection.

6.2.8. Let $M$ be a two-dimensional manifold, and suppose that on a coordinate patch
$(x^1, x^2)$, the metric is of the form

re Ge 2 where $r^2 = (x^1)^2 + (x^2)^2$.

Find the function $f(r)$ that gives a flat connection.

6.2.9. Consider the cylinder $S^2 \times \mathbb{R}$ in $\mathbb{R}^4$ given by the parametrization

\[ F(u^1, u^2, u^3) = (\cos u^1 \sin u^2,\ \sin u^1 \sin u^2,\ \cos u^2,\ u^3) \]

and equip it with the metric induced from $\mathbb{R}^4$. Over the open coordinate patch
$U = (0, 2\pi) \times (0, \pi) \times \mathbb{R}$, calculate the metric coefficients and the Christoffel
symbols for the Levi-Civita connection.

6.2.10. Consider the unit sphere $S^3$ as a submanifold of $\mathbb{R}^4$ with the induced metric.
Consider the coordinate patch on $S^3$ given by the parametrization in 6.1.14(a).
Calculate one nonzero Christoffel symbol $\Gamma^i_{jk}$. (It would be quite tedious to calculate
all of the symbols since there could be as many as 27 of them.) [Hint: Show
that the conditions of Problem 6.2.13 apply to this coordinate patch and use the
result.]

6.2.11. Finish calculating directly the Christoffel symbols in Example 6.2.6.

6.2.12. Let $(M,g)$ be a Riemannian manifold. Prove that a connection $\nabla$ satisfies $\nabla g = 0$
identically if and only if (6.18) holds where $g = \langle\, ,\, \rangle$.

6.2.13. Let $(M, g)$ be a Riemannian manifold and let $U$ be an orthogonal coordinate patch,
i.e., $g_{ij} = 0$ if $i \neq j$ over $U$. Let $\nabla$ be the Levi-Civita connection on $M$.
(a) Prove that on $U$ the Christoffel symbols $\Gamma^k_{ij} = 0$ unless $k = i$, $i = j$, or $k = j$.
(b) Show that $\nabla$ can be specified on $U$ by $2n^2 - n$ smooth functions, i.e., there
are at most that many distinct nonzero Christoffel symbols.
(c) Show that

\[ \Gamma^k_{ii} = \pm\frac{1}{2}\, g^{kk}\, \frac{\partial g_{ii}}{\partial x^k} \quad\text{and}\quad \Gamma^k_{ik} = \frac{1}{2}\, g^{kk}\, \frac{\partial g_{kk}}{\partial x^i}, \]

where there is no summation in either of these formulas and where the sign
of $\pm$ is $+1$ if $i = k$ and $-1$ if $i \neq k$.

6.2.14. Let $M$ be a smooth manifold, and let $\nabla$ be a connection on $TM$. Define a map
$\tau : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$ by

\[ \tau(X, Y) = \nabla_X Y - \nabla_Y X - [X, Y]. \]

(a) Show that $\tau$ is a tensor field of type (1,2). This is called the torsion tensor
associated to the connection $\nabla$.
(b) Prove that the components of $\tau$ with respect to a basis are $\tau^k_{ij} = \Gamma^k_{ij} - \Gamma^k_{ji}$.
(c) The connection $\nabla$ is called symmetric if its torsion vanishes identically.
Deduce that $\nabla$ is symmetric if and only if over every coordinate patch $U$, the
component functions satisfy $\Gamma^k_{ij} = \Gamma^k_{ji}$.

6.2.15. Let $\nabla$ be an affine connection on $M$. Prove that $\nabla + A$ is an affine connection,
where $A$ is a (1, 2)-tensor field. Conversely, prove that every affine connection is
of the form $\nabla + A$ for some (1, 2)-tensor field $A$.

6.2.16. Consider the divergence operator introduced in Problem 6.1.13 and discussed at
the end of this section.
(a) Show from the definition in Problem 6.1.13 that
\[ \operatorname{div} X = X^i{}_{;i}, \]
where we've used the Levi-Civita connection to take the covariant derivative.
(b) Consider the definition in (6.25) for the divergence on a 1-form. Show that
\[ \operatorname{div} \omega = (g^{ij} \omega_j)_{;i} = g^{ij} \omega_{j;i}. \]

6.2.17. Let $f \in C^\infty(M)$ be a smooth function on a manifold $M$ equipped with any affine
connection $\nabla$. Show that

\[ f_{;i;j} - f_{;j;i} = -\tau^k_{ij}\, f_{;k}, \]

where $\tau$ is the torsion tensor from Problem 6.2.14. Conclude that if $\nabla$ is symmetric,
then $f_{;i;j} = f_{;j;i}$.

6.2.18. Let $M$ be a smooth manifold, let $\eta \in \Omega^2(M)$ be a 2-form, and let $\nabla$ be any
symmetric connection on $M$. Show that in any coordinate system,

\[ c_{\alpha\beta\gamma} = \eta_{\alpha\beta;\gamma} + \eta_{\beta\gamma;\alpha} + \eta_{\gamma\alpha;\beta} = \partial_\gamma \eta_{\alpha\beta} + \partial_\alpha \eta_{\beta\gamma} + \partial_\beta \eta_{\gamma\alpha}. \tag{6.26} \]

Show that if we write $\eta = \frac{1}{2} \eta_{\alpha\beta}\, dx^\alpha \wedge dx^\beta$, then the left-hand side of (6.26) is the
component of $d\eta$ in the basis $dx^\alpha \wedge dx^\beta \wedge dx^\gamma$ in the sense that

\[ d\eta = c_{\alpha\beta\gamma}\, dx^\alpha \wedge dx^\beta \wedge dx^\gamma, \]

where we sum over all $\alpha, \beta, \gamma = 1, \ldots, n$.

6.2.19. Let $(M,g)$ be a Riemannian manifold with Levi-Civita connection $\nabla$. Show that
over every coordinate patch,

\[ \frac{\partial \left( \ln \sqrt{\det g} \right)}{\partial x^i} = \Gamma^j_{ij}, \]

where one sums over $j$ on the right-hand side. [Hint: Use a result in Problem
2.3.12.]

6.2.20. Let $\nabla$ be an affine connection on $M$, and let $U$ be a coordinate patch on $M$.
(a) Show that there exists a unique matrix of 1-forms $\omega_i^j$ defined on $U$ such that

\[ \nabla_X \partial_i = \omega_i^j(X)\, \partial_j \]

for all $X \in \mathfrak{X}(M)$. (The matrix $\omega_i^j$ is called the connection 1-forms for this
coordinate system.)
(b) Suppose that $(M,g)$ is a Riemannian manifold. Show that $\nabla$ is compatible
with the metric $g$ if, over any coordinate system $U$,

\[ g_{jk}\, \omega_i^k + g_{ik}\, \omega_j^k = dg_{ij}. \]


6.3 Vector Fields along Curves; Geodesics 


Suppose we think of the trajectory of a particle on a manifold $M$. One would
describe it as a curve $\gamma(t)$ on $M$. Furthermore, in order to develop a theory of
dynamics on manifolds, one would need to be able to make sense of the acceleration 
of the curve or of higher derivatives of the curve. In this section, we define vector 
fields on curves on manifolds. Once we define a covariant derivative of a vector 
field on a curve, we can then discuss parallel vector fields on the curve and the 
acceleration field along the curve. We then show that defining a geodesic as a curve 
whose acceleration is identically 0 leads to the classical understanding of a geodesic 
as a path of minimum length in some sense. 


6.3.1 Vector Fields along Curves 


Definition 6.3.1. Let $M$ be a smooth manifold, and let $\gamma : I \to M$ be a smooth
curve in $M$, where $I$ is an interval in $\mathbb{R}$. We call $V$ a vector field along $\gamma$ if for each
$t \in I$, $V(t)$ is a tangent vector in $T_{\gamma(t)}M$ and if $V$ defines a smooth map $I \to TM$.
We denote by $\mathfrak{X}_\gamma(M)$ the set of all smooth vector fields on $M$ along $\gamma$.


A vector field along a curve is not necessarily the restriction of a vector field
on $M$ to $\gamma(I)$. For example, suppose a curve self-intersects, so that $\gamma(t_0) = \gamma(t_1)$ with
$t_0 \neq t_1$. If $V(t_0) \neq V(t_1)$, then there exists no vector field $Y$ on $M$ such that
$V(t) = Y_{\gamma(t)}$ for all $t \in I$ (see Figure 6.3). If $V$ is the restriction of a vector field $Y$,
then we say that $V$ is induced from $Y$ or that $V$ extends to $Y$.


Proposition 6.3.2. Let $M$ be a smooth manifold with an affine connection $\nabla$,
and let $\gamma : I \to M$ be a smooth curve on $M$. There exists a unique operator
$D_t : \mathfrak{X}_\gamma(M) \to \mathfrak{X}_\gamma(M)$ (also denoted by $\frac{D}{dt}$) such that:


1. $D_t(V + W) = D_t V + D_t W$ for all $V, W \in \mathfrak{X}_\gamma(M)$.

2. $D_t(fV) = \frac{df}{dt} V + f D_t V$ for all $V \in \mathfrak{X}_\gamma(M)$ and all $f \in C^\infty(I)$.

3. If $V$ extends to a vector field $Y \in \mathfrak{X}(M)$, then $D_t V = \nabla_{\gamma'(t)} Y$.


Figure 6.3: A nonextendable vector field on a curve.


Note that the last condition makes sense by the fact that $\nabla_X Y|_p$ only depends
on the values of $Y$ in a neighborhood of $p$ and on the value of $X_p$ (see Problem
6.2.4).

Before proving Proposition 6.3.2, we introduce the dot notation for derivatives.
The only purpose is to slightly simplify our equations' notation. If $x(t)$ is a
real-valued function of a real variable, we write

\[ \dot{x}(t) \overset{\text{def}}{=} \frac{dx}{dt} \quad\text{and}\quad \ddot{x}(t) \overset{\text{def}}{=} \frac{d^2x}{dt^2}. \]

The dot notation is common in physics in the context of taking derivatives with
respect to time. Therefore, $\dot{x}$ is usually used when one uses the letter $t$ as the only
independent variable for the function $x$.


Proof of Proposition 6.3.2. Let us first suppose that an operator $D_t$ with Properties
1-3 exists. Let $U$ be a coordinate patch of $M$ with coordinates $x = (x^1, \ldots, x^n)$.
For any $V \in \mathfrak{X}_\gamma(M)$, write $V = v^j \partial_j$ where $v^j \in C^\infty(I)$ are smooth functions over
$I$. By Conditions 1 and 2 we have

\[ D_t V = \dot{v}^j \partial_j + v^j D_t(\partial_j). \]

Now if we write $\gamma(t) = (\gamma^1(t), \ldots, \gamma^n(t))$ for the coordinate functions of $\gamma$ over $U$,
then $\gamma'(t) = \sum_{i=1}^{n} \dot{\gamma}^i \partial_i$. Thus, by Condition 3,

\[ D_t(\partial_j) = \nabla_{\gamma'(t)} \partial_j = \sum_{i=1}^{n} \dot{\gamma}^i \nabla_{\partial_i} \partial_j = \dot{\gamma}^i\, \Gamma^k_{ij}\, \partial_k. \]

Hence, we deduce the following formula for $D_t V$ in coordinates over $U$:

\[ D_t V = \frac{dv^k}{dt}\, \partial_k + \Gamma^k_{ij}\, \frac{d\gamma^i}{dt}\, v^j\, \partial_k = \left( \dot{v}^k + \Gamma^k_{ij}\, \dot{\gamma}^i v^j \right) \partial_k. \tag{6.27} \]

Equation (6.27) shows that if there does exist an operator satisfying Conditions
1-3, then the operator is unique. To prove existence over all of $\gamma$, we define $D_t^\alpha$ by
(6.27) on each coordinate chart $U_\alpha$. However, since $D_t^\alpha$ is unique on each coordinate
chart, $D_t^\alpha = D_t^\beta$ over $U_\alpha \cap U_\beta$ if $U_\alpha$ and $U_\beta$ are overlapping coordinate charts.
Hence, as $\alpha$ ranges over all coordinate charts in the atlas, the collection of operators
$D_t^\alpha$ extends to a single operator $D_t$ over all of $M$.


Note that Equation (6.27) in the above proof gives the formula for $D_t$ over a
coordinate patch of $M$. In particular, the expression in the parentheses on the right
gives the component functions (in the index $k$) for $D_t V$.


Definition 6.3.3. The operator $D_t : \mathfrak{X}_\gamma(M) \to \mathfrak{X}_\gamma(M)$ defined in Proposition 6.3.2
is called the covariant derivative along $\gamma$.


In the context of Riemannian manifolds, the covariant derivative along a curve
has the following interesting property.


Proposition 6.3.4. Let $\gamma$ be a smooth curve on a Riemannian manifold $(M, g)$
equipped with the Levi-Civita connection. Write $g = \langle\, ,\, \rangle$. Let $V$ and $W$ be vector
fields along $\gamma$. Then

\[ \frac{d}{dt} \langle V, W\rangle = \langle D_t V, W\rangle + \langle V, D_t W\rangle. \]

Proof. (Left as an exercise for the reader. See Problem 6.3.9.)


The notion of a vector field along a curve (in a manifold $M$) leads us immediately
to two useful notions: parallel transport and acceleration.


Definition 6.3.5. Let $M$ be a smooth manifold with an affine connection $\nabla$, and
let $\gamma : I \to M$ be a smooth curve on $M$. A vector field $V$ along $\gamma$ is called parallel
if $D_t V = 0$ identically.


The existence of parallel vector fields on a curve amounts to the solvability of a 
system of differential equations. 


Proposition 6.3.6 (Parallel Transport). Let $M$ be a smooth manifold with an
affine connection $\nabla$, and let $\gamma : I \to M$ be a smooth curve on $M$ where $I$ is a
compact interval of $\mathbb{R}$. Let $t_0 \in I$, set $p = \gamma(t_0)$, and let $V_0$ be any vector in $T_pM$.
There exists a unique vector field $V$ of $M$ along $\gamma$ that is parallel and has $V(t_0) = V_0$.


Proof. Suppose first that $M$ is a manifold that is covered with a single coordinate
system $x = (x^1, \ldots, x^n)$. By (6.27), the condition $D_t V = 0$ means that

\[ \dot{v}^k + \Gamma^k_{ij}\, \dot{\gamma}^i v^j = 0 \quad\text{for all } k = 1, \ldots, n. \tag{6.28} \]

The values $\Gamma^k_{ij}$ depend on the position of $\gamma(t)$ as do the derivatives $\dot{\gamma}^i(t)$, but
neither of these depends on the functions $v^i(t)$. Hence, (6.28) is a system of linear,
homogeneous, ordinary differential equations in the $n$ functions $v^i(t)$. By a
standard result of ordinary differential equations (see [18, Appendix A] or [8, Section
8]), given an initial value $t = t_0$ and initial conditions $v^i(t_0) = v^i_0$, there exists a
unique solution to the system of equations satisfying these initial conditions. (The
particular form of the nonautonomous system from (6.28) and the hypothesis that $I$
is compact imply that the system satisfies the Lipschitz condition, which establishes
the uniqueness of the solutions.) Hence $V$ exists and is unique.

Now suppose that $M$ cannot be covered by a single coordinate chart. We only
need to consider coordinate charts that cover $\gamma(I)$. But since $\gamma(I)$ is compact, we
can cover it with only a finite number of coordinate charts. However, on each of
these charts, we have seen that there is a unique parallel vector field, as described.
By identifying the vector fields over each coordinate chart, we obtain a single vector
field over all of $\gamma$ that is parallel and satisfies $V(t_0) = V_0$.


Figure 6.4: Path dependence of parallel transport.


Definition 6.3.7. The vector field $V$ in Proposition 6.3.6 is called the parallel
transport of $V_0$ along $\gamma$.
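
As a numerical illustration of Proposition 6.3.6, the sketch below integrates the linear system (6.28) with a classical fourth-order Runge-Kutta scheme along a circle of constant colatitude on the unit sphere. The Christoffel symbols of the round sphere in (longitude, colatitude) coordinates are supplied here as standard values, and the function names, the step count, and the choice of curve are all assumptions made for this example.

```python
import math

def sphere_christoffel(theta, phi):
    """gamma[k][i][j] = Gamma^k_{ij} on the unit sphere.

    Index 0 is the longitude theta, index 1 the colatitude phi.
    """
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][0][1] = gamma[0][1][0] = math.cos(phi) / math.sin(phi)
    gamma[1][0][0] = -math.sin(phi) * math.cos(phi)
    return gamma

def parallel_transport(curve, velocity, v0, t0, t1, steps=2000):
    """Integrate dv^k/dt = -Gamma^k_{ij} (dgamma^i/dt) v^j from t0 to t1 (RK4)."""
    def rhs(t, v):
        gam = sphere_christoffel(*curve(t))
        dgamma = velocity(t)
        return [-sum(gam[k][i][j] * dgamma[i] * v[j]
                     for i in range(2) for j in range(2)) for k in range(2)]

    h = (t1 - t0) / steps
    t, v = t0, list(v0)
    for _ in range(steps):
        k1 = rhs(t, v)
        k2 = rhs(t + h / 2, [v[a] + h / 2 * k1[a] for a in range(2)])
        k3 = rhs(t + h / 2, [v[a] + h / 2 * k2[a] for a in range(2)])
        k4 = rhs(t + h, [v[a] + h * k3[a] for a in range(2)])
        v = [v[a] + h / 6 * (k1[a] + 2 * k2[a] + 2 * k3[a] + k4[a])
             for a in range(2)]
        t += h
    return v

# Closed circle of colatitude pi/3, traversed once.
phi0 = math.pi / 3
latitude = lambda t: (t, phi0)
latitude_velocity = lambda t: (1.0, 0.0)
v_end = parallel_transport(latitude, latitude_velocity, [0.0, 1.0], 0.0, 2 * math.pi)
print(v_end)
```

Along this curve the system has constant coefficients, and solving it by hand shows that the transported vector rotates by the angle $2\pi \cos\varphi_0$ relative to the coordinate frame. For $\varphi_0 = \pi/3$ this is a half turn, so the computed vector should be close to $(0, -1)$: after one full loop the vector does not return to its initial value, although its length is preserved.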


It is important to note that the parallel transport of $V_0$ from a point $p$ to a
point $q$ along two different paths generally results in different vectors in $T_qM$. In
Figure 6.4, the tangent vector $V_0$ at $p$ produces different tangent vectors at $q$ when
transported along the black curve versus along the gray curve. One says that parallel
transport is nonintegrable. However, it is not hard to see, either geometrically or
by solving Equation (6.28), that in $\mathbb{R}^n$ parallel transport does not depend on the
path. Therefore, this nonintegrability of parallel transport characterizes the notion
of curvature, as we will see in the following section.

As a second application of the covariant derivative along a curve, we finally
introduce the notion of acceleration of a curve on a manifold.


Definition 6.3.8. Let $M$ be a smooth manifold with an affine connection and let
$\gamma : I \to M$ be a smooth curve on $M$. For all $t \in I$, we define the acceleration of $\gamma$
on $M$ as the covariant derivative $D_t \gamma'(t)$ of $\gamma'(t)$ along $\gamma$.


Example 6.3.9. With the definition of the acceleration, we are in a position to
phrase Newton's second law of motion on a manifold. In $\mathbb{R}^3$, Newton's law
states that if a particle has constant mass $m$ and is influenced by the exterior forces
$\vec{F}_i$, then the particle follows a path $\vec{x}(t)$ that satisfies $\sum_i \vec{F}_i = m\vec{x}''$. Translated into
the theory of manifolds, if a force (or collection of forces) makes a particle move
along some curve $\gamma$, then writing $F$ as the vector field along $\gamma$ that describes the
force, $\gamma$ must satisfy

\[ m D_t \gamma'(t) = F(t). \]

The acceleration is itself a vector field along the curve $\gamma$, so the notions of all the
higher derivatives are defined as well.


6.3.2 Geodesics 


Intuitively speaking, a geodesic on a manifold is a curve that generalizes the notion
of a straight line in $\mathbb{R}^n$. This seemingly simple task is surprisingly difficult. Only
now do we possess the necessary background to do so. Though everyone has an
intuitive sense of what a straight line is, even Euclid's original definitions for a
straight line do not satisfy today's standards of precision. We introduce geodesics
using two different approaches, each taking a property of straight lines in $\mathbb{R}^n$ and
translating it into the context of manifolds.


Definition 6.3.10. Let $M$ be a smooth manifold with an affine connection $\nabla$.
A curve $\gamma : I \to M$ is called a geodesic if its acceleration is identically 0, i.e.,
$D_t \gamma'(t) = 0$.


Note that this definition does not require a metric structure on $M$, simply an
affine connection. We should also observe that this definition relies on a specific
parametrization of $\gamma$. The definition is modeled after the fact that the natural
parametrization of a straight line in $\mathbb{R}^n$ by $\vec{\gamma}(t) = \vec{p} + t\vec{v}$ for constant vectors $\vec{p}$ and
$\vec{v}$ satisfies $\vec{\gamma}''(t) = \vec{0}$. However, the curve $\vec{x}(t) = \vec{p} + t^3 \vec{v}$ traces out the same set
of points but $\vec{x}''(t) = 6t\vec{v}$, which is not identically $\vec{0}$. Despite this, we can leave
Definition 6.3.10 as it is and keep in mind the role of the parametrization.
Proposition 6.3.11 (Geodesic Equations). Let $M$ be a smooth manifold equipped
with an affine connection, and let $x = (x^1, \ldots, x^n)$ be a system of coordinates on
a chart $U$. A curve $\gamma$ is a geodesic on $U$ if and only if the coordinate functions
$\gamma(t) = (\gamma^1(t), \ldots, \gamma^n(t))$ satisfy

\[ \frac{d^2\gamma^i}{dt^2} + \Gamma^i_{jk}(\gamma(t))\, \frac{d\gamma^j}{dt} \frac{d\gamma^k}{dt} = 0 \quad\text{for all } i. \tag{6.29} \]

Proof. This follows immediately from (6.27).


Equation (6.29) for a geodesic is a second-order system of ordinary differential
equations in the functions $\gamma^i(t)$. Setting $v^i(t) = \dot{\gamma}^i$, we can write (6.29) as a
first-order system in the $2n$ functions $\gamma^i$ and $v^i$ by

\[ \begin{cases} \dot{\gamma}^i = v^i, \\ \dot{v}^i = -\Gamma^i_{jk}(\gamma(t))\, v^j v^k. \end{cases} \tag{6.30} \]

This system is now first-order and non-linear but autonomous (does not depend
explicitly on $t$). Standard theorems in differential equations [3, Theorems 7.3, 7.4]
imply the following foundational result.
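
The first-order form (6.30) is also the form in which a numerical integrator is usually applied. The sketch below integrates (6.30) on the unit sphere with a fourth-order Runge-Kutta scheme; the sphere's Christoffel symbols in (longitude, colatitude) coordinates are standard values supplied for this example, and the function names, initial data, and step count are illustrative assumptions.

```python
import math

def sphere_christoffel(theta, phi):
    """gamma[i][j][k] = Gamma^i_{jk} on the unit sphere (0: longitude, 1: colatitude)."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][0][1] = gamma[0][1][0] = math.cos(phi) / math.sin(phi)
    gamma[1][0][0] = -math.sin(phi) * math.cos(phi)
    return gamma

def geodesic_rhs(state):
    """Right-hand side of (6.30) for state = [theta, phi, v_theta, v_phi]."""
    theta, phi, v_theta, v_phi = state
    gam = sphere_christoffel(theta, phi)
    v = (v_theta, v_phi)
    acc = [-sum(gam[i][j][k] * v[j] * v[k] for j in range(2) for k in range(2))
           for i in range(2)]
    return [v_theta, v_phi, acc[0], acc[1]]

def integrate_geodesic(state0, t_end, steps=2000):
    """RK4 integration of the autonomous system from t = 0 to t = t_end."""
    h = t_end / steps
    s = list(state0)
    for _ in range(steps):
        k1 = geodesic_rhs(s)
        k2 = geodesic_rhs([s[a] + h / 2 * k1[a] for a in range(4)])
        k3 = geodesic_rhs([s[a] + h / 2 * k2[a] for a in range(4)])
        k4 = geodesic_rhs([s[a] + h * k3[a] for a in range(4)])
        s = [s[a] + h / 6 * (k1[a] + 2 * k2[a] + 2 * k3[a] + k4[a])
             for a in range(4)]
    return s

def embed(theta, phi):
    """Point of the unit sphere in R^3 with the given coordinates."""
    return (math.cos(theta) * math.sin(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(phi))

# Start on the equator with a unit-speed velocity tilted away from it.
end = integrate_geodesic([0.0, math.pi / 2, 0.6, 0.8], 2.0)
print(embed(end[0], end[1]))
```

With these initial data the embedded starting point is $(1, 0, 0)$ and the initial velocity in $\mathbb{R}^3$ is $(0, 0.6, -0.8)$, so the computed endpoint can be compared with the great circle $\cos t\,(1,0,0) + \sin t\,(0, 0.6, -0.8)$ at $t = 2$. The chosen geodesic stays away from the poles, where these coordinates degenerate.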


Theorem 6.3.12. Let $M$ be a manifold with an affine connection. For any $p \in M$,
for any $V \in T_pM$, and for any $t_0 \in \mathbb{R}$, there exists an open interval $I$ containing $t_0$
and a unique geodesic $\gamma : I \to M$ satisfying $\gamma(t_0) = p$ and $\gamma'(t_0) = V$.


This theorem shows the existence of the curve $\gamma$ by solving (6.29) over a
coordinate neighborhood. In this case, the interval $I$ may be limited by virtue of the fact
that $\gamma(I) \subset U$. It may be possible to extend $\gamma$ over other coordinate patches. If
$\gamma(t_1)$ for some $t_1 \in I$ is in another coordinate patch $\bar{U}$, then we can uniquely extend
the geodesic over $\bar{U}$ as going through the point $\gamma(t_1)$ with velocity $\gamma'(t_1)$. We define
a maximal geodesic as a geodesic $\gamma : I \to M$ whose domain interval cannot be
extended. If $\gamma$ is a maximal geodesic with $\gamma(t_0) = p$ and $\gamma'(t_0) = V$ for some $t_0 \in I$,
we call $\gamma$ the geodesic with initial point $p$ and initial velocity $V \in T_pM$, and we
denote it by $\gamma_V$.

Another defining property of a straight line in $\mathbb{R}^n$ is that the shortest path
between two points is a straight line segment. If we use the concept of distance, we
need a metric. Let $(M,g)$ be a Riemannian manifold equipped with the Levi-Civita
connection, and let $\gamma$ be a geodesic on $M$. By Proposition 6.3.4,

\[ \frac{d}{dt} \langle \gamma'(t), \gamma'(t) \rangle = 2 \langle D_t \gamma'(t), \gamma'(t) \rangle = 0, \]

so we can conclude the following initial result.

Proposition 6.3.13. A geodesic on a Riemannian manifold has constant speed.

Now on a Riemannian manifold, an alternate approach to defining geodesics 
is to call a geodesic a path of shortest length between two points. However, this 
definition is not quite good enough, as Figure 6.5 indicates. Both curves connecting 
p and q are geodesics, but one is shorter than the other. To be more precise, we call 
+ a geodesic connecting p; and pg if there is an interval [¢1, tg] such that y(t1) = pi, 
y(t2) = p2, and y minimizes the arclength integral 


b= [Jason Oat. (6.31) 
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Figure 6.5: Two geodesics on a cylinder. 


Techniques of calculus of variations discussed in Appendix B produce the differential
equations for the curve $\gamma$ that minimizes the arclength. However, similar to
optimization methods in regular calculus, the solutions we obtain are local minima,
which means in our case that there are no small deviations of $\gamma$ that produce a
shorter path between $p$ and $q$. It is tedious to show, but Theorem B.3.1 implies
that a curve $\gamma$ that minimizes the integral in (6.31) must satisfy

$$\frac{d^2\gamma^i}{ds^2} + \Gamma^i_{jk}\,\frac{d\gamma^j}{ds}\frac{d\gamma^k}{ds} = 0, \tag{6.32}$$

where $s$ is the arclength of $\gamma$. Proposition 6.3.11 and Proposition 6.3.13 show
that defining a geodesic as having no acceleration is equivalent to defining it as
minimizing length in the above sense.
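The distinction between a geodesic and a globally shortest path can be checked numerically on the cylinder. The sketch below is only an illustration under stated assumptions: it uses the unit cylinder with the hypothetical chart $(\theta, z) \mapsto (\cos\theta, \sin\theta, z)$, for which the metric is $d\theta^2 + dz^2$, so geodesics are straight lines in the chart. The function names `helix` and `polyline_length` and the chosen endpoints are ours, not the text's.

```python
import math

def helix(theta0, z0, theta1, z1, winding, n=2000):
    """Sample a cylinder geodesic from (theta0, z0) to (theta1, z1).

    In the (theta, z) chart the geodesic is a straight segment; `winding`
    adds whole turns around the axis, giving a different geodesic with
    the same endpoints on the surface.
    """
    dtheta = (theta1 - theta0) + 2.0 * math.pi * winding
    dz = z1 - z0
    points = []
    for k in range(n + 1):
        t = k / n
        theta, z = theta0 + t * dtheta, z0 + t * dz
        points.append((math.cos(theta), math.sin(theta), z))
    return points

def polyline_length(points):
    """Approximate arclength in R^3 by summing chord lengths."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

# Two geodesics joining the same pair of points on the cylinder.
short_path = helix(0.0, 0.0, math.pi / 2, 1.0, winding=0)
long_path = helix(0.0, 0.0, math.pi / 2, 1.0, winding=-1)
len_short = polyline_length(short_path)
len_long = polyline_length(long_path)
print(len_short, len_long)
```

Both curves solve the geodesic equation in the chart, yet only the first realizes the minimum of the arclength integral between the two endpoints; the second is merely a critical curve of that integral.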


Example 6.3.14 (Sphere). Consider the parametrization of the sphere given by

$$\vec X(x^1, x^2) = (R\cos x^1 \sin x^2,\; R\sin x^1 \sin x^2,\; R\cos x^2),$$

where $x^1$ is the longitude $\theta$ in spherical coordinates and $x^2$ is the angle $\phi$ down
from the positive $z$-axis. In Example 6.1.14, we determined the coefficients of the
metric tensor. Then it is easy to calculate the Christoffel symbols $\Gamma^i_{jk}$ for the sphere.
Writing $x^i(s)$ for the coordinate functions of the curve, Equations (6.32) for
geodesics on the sphere become

$$\frac{d^2x^1}{ds^2} + 2\cot(x^2)\,\frac{dx^1}{ds}\frac{dx^2}{ds} = 0, \qquad \frac{d^2x^2}{ds^2} - \sin(x^2)\cos(x^2)\left(\frac{dx^1}{ds}\right)^2 = 0. \tag{6.33}$$

A geodesic on the sphere is now just a curve $\vec\gamma(s) = \vec X(x^1(s), x^2(s))$ where
$x^1(s)$ and $x^2(s)$ satisfy the system of differential equations in (6.33). Taking a first
derivative of $\vec\gamma(s)$ gives

$$\vec\gamma'(s) = R\left(-\sin x^1 \sin x^2\,\frac{dx^1}{ds} + \cos x^1\cos x^2\,\frac{dx^2}{ds},\;\; \cos x^1\sin x^2\,\frac{dx^1}{ds} + \sin x^1\cos x^2\,\frac{dx^2}{ds},\;\; -\sin x^2\,\frac{dx^2}{ds}\right),$$

and the second derivative, after simplification using (6.33), is

$$\vec\gamma''(s) = -\left[\sin^2(x^2)\left(\frac{dx^1}{ds}\right)^2 + \left(\frac{dx^2}{ds}\right)^2\right]\vec\gamma(s).$$

However, the term $R^2\left(\sin^2(x^2)\left(\frac{dx^1}{ds}\right)^2 + \left(\frac{dx^2}{ds}\right)^2\right)$ is the sphere metric applied to
$((x^1)'(s), (x^2)'(s))$, which is precisely the square of the speed of $\vec\gamma(s)$. Since the geodesic
is parametrized by arclength, its speed is identically 1. Thus, (6.33) leads to the
differential equation

$$\vec\gamma''(s) + \frac{1}{R^2}\,\vec\gamma(s) = 0.$$

Standard techniques with differential equations allow one to show that all solutions
to this differential equation are of the form

$$\vec\gamma(s) = \vec a\cos\left(\frac{s}{R}\right) + \vec b\sin\left(\frac{s}{R}\right),$$

where $\vec a$ and $\vec b$ are constant vectors. Note that $\vec\gamma(0) = \vec a$ and that $\vec\gamma'(0) = \frac{1}{R}\vec b$.
Furthermore, to satisfy the conditions that $\vec\gamma(s)$ lie on the sphere of radius $R$ and
be parametrized by arclength, we deduce that $\vec a$ and $\vec b$ satisfy

$$\|\vec a\| = R, \qquad \|\vec b\| = R, \qquad\text{and}\qquad \vec a\cdot\vec b = 0.$$

Therefore, we find that $\vec\gamma(s)$ traces out a great arc on the sphere that is the
intersection of the sphere and the plane through the center of the sphere spanned by
$\vec\gamma(0)$ and $\vec\gamma'(0)$.
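The conclusion of this example can be tested numerically. The following sketch is a plausibility check rather than a proof: it integrates what we take to be the geodesic system for this parametrization, $\ddot x^1 = -2\cot(x^2)\,\dot x^1\dot x^2$ and $\ddot x^2 = \sin(x^2)\cos(x^2)\,(\dot x^1)^2$, with a classical fourth-order Runge-Kutta step, and compares the result with the great-circle formula $\vec a\cos(s/R) + \vec b\sin(s/R)$. The radius, initial point, direction angle `alpha`, and step size are arbitrary illustrative choices.

```python
import math

R = 2.0  # sphere radius (arbitrary choice for the illustration)

def embed(x1, x2):
    """The parametrization X(x1, x2) of the sphere of radius R."""
    return (R * math.cos(x1) * math.sin(x2),
            R * math.sin(x1) * math.sin(x2),
            R * math.cos(x2))

def velocity(x1, x2, u1, u2):
    """First derivative of X(x1(s), x2(s)), with u_i = dx^i/ds."""
    return (R * (-math.sin(x1) * math.sin(x2) * u1 + math.cos(x1) * math.cos(x2) * u2),
            R * (math.cos(x1) * math.sin(x2) * u1 + math.sin(x1) * math.cos(x2) * u2),
            R * (-math.sin(x2) * u2))

def rhs(state):
    """Geodesic equations on the sphere written as a first-order system."""
    x1, x2, u1, u2 = state
    return (u1, u2,
            -2.0 * math.cos(x2) / math.sin(x2) * u1 * u2,
            math.sin(x2) * math.cos(x2) * u1 * u1)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shift(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shift(state, k1, h / 2))
    k3 = rhs(shift(state, k2, h / 2))
    k4 = rhs(shift(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

# Unit-speed initial data: R^2 (sin^2(x2) u1^2 + u2^2) = 1.
alpha = 0.7
state = (0.3, 1.1, math.cos(alpha) / (R * math.sin(1.1)), math.sin(alpha) / R)
a = embed(state[0], state[1])
b = tuple(R * c for c in velocity(*state))

h, steps = 1e-3, 1500
max_err, max_speed_dev = 0.0, 0.0
for n in range(1, steps + 1):
    state = rk4_step(state, h)
    s = n * h
    exact = tuple(ai * math.cos(s / R) + bi * math.sin(s / R) for ai, bi in zip(a, b))
    max_err = max(max_err, math.dist(embed(state[0], state[1]), exact))
    max_speed_dev = max(max_speed_dev, abs(math.hypot(*velocity(*state)) - 1.0))
print(max_err, max_speed_dev)
```

The second tracked quantity also illustrates Proposition 6.3.13: the speed of the numerically integrated geodesic stays at 1 up to integration error.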


There are many properties of lines that no longer hold for geodesics on manifolds.
For example, lines in $\mathbb{R}^n$ are ("obviously") simple curves, i.e., they do not intersect
themselves. In Example 6.3.14, we showed that the geodesics on a sphere are arcs
of great circles (equators). In this case, a maximal geodesic is a whole circle that,
as a closed curve, is still simple. In contrast, Figure 6.6 of a distorted sphere shows
only a portion of a geodesic that is not closed and intersects itself many times. The
problem of finding closed geodesics on surfaces illustrates how central the study of
geodesics is in current research: in 1917, Birkhoff used techniques from dynamical
systems to show that every deformed sphere has at least one closed geodesic [10];
in 1929, Lusternik and Schnirelmann improved upon this and proved that there
always exist three closed geodesics on a deformed sphere [37]; and in 1992 and
1993, Franks and Bangert ([23] and [7]) proved that there exist an infinite number
of closed geodesics on a deformed sphere. However, a proof of the existence of a
closed geodesic would not necessarily help us construct one for any given surface.

Figure 6.6: A nonclosed geodesic on a manifold.

We end this section by presenting the so-called exponential map. Theorem 6.3.12
allows us to define a map, for each $p \in M$, from the tangent space $T_pM$ to $M$ by
mapping $V$ to a fixed distance along the unique geodesic $\gamma_V$.


Definition 6.3.15. Let $p$ be a point on a Riemannian manifold $(M, g)$. Let $\mathcal{D}_p$ be
the set of tangent vectors $V \in T_pM$ such that the geodesic $\gamma_V$, with $\gamma_V(0) = p$, is
defined over the interval $[0, 1]$. The exponential map, written $\exp_p$, is the function

$$\exp_p : \mathcal{D}_p \to M, \qquad V \mapsto \gamma_V(1).$$

Lemma 6.3.16 (Scaling Lemma). Let $V \in T_pM$, and let $c \in \mathbb{R}^{>0}$. Suppose that
$\gamma_V(t)$ is defined over $(-\delta, \delta)$, with $\gamma_V(0) = p$. Then $\gamma_{cV}(t)$ is defined over the
interval $(-\delta/c, \delta/c)$, and

$$\gamma_{cV}(t) = \gamma_V(ct).$$

Proof. (Left as an exercise for the reader. See Problem 6.3.10.)


By virtue of the scaling lemma, we can write for the geodesic through $p$ along
$V$,

$$\gamma_V(t) = \exp_p(tV). \tag{6.34}$$
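On the unit sphere the exponential map has a closed form, which makes the scaling lemma and (6.34) easy to test. The sketch below assumes the great-circle description of geodesics from Example 6.3.14 with $R = 1$, so that $\exp_p(V) = \cos(\|V\|)\,p + \sin(\|V\|)\,V/\|V\|$ for a unit vector $p$ and a tangent vector $V \perp p$. The function names `exp_map` and `geodesic` and the sample values are illustrative assumptions.

```python
import math

def exp_map(p, V):
    """exp_p(V) on the unit sphere in R^3; p is a unit vector, V is tangent at p."""
    norm = math.sqrt(sum(v * v for v in V))
    if norm == 0.0:
        return tuple(p)
    return tuple(math.cos(norm) * pi + math.sin(norm) * vi / norm
                 for pi, vi in zip(p, V))

def geodesic(p, V, t):
    """gamma_V(t) = exp_p(tV), following (6.34)."""
    return exp_map(p, tuple(t * v for v in V))

p = (0.0, 0.0, 1.0)          # north pole
V = (0.3, -0.4, 0.0)         # tangent vector at p with norm 0.5
c, t = 2.5, 0.37

lhs = geodesic(p, tuple(c * v for v in V), t)   # gamma_{cV}(t)
rhs = geodesic(p, V, c * t)                      # gamma_V(ct)

# Arclength from p to exp_p(V) along the great circle equals the norm of V.
q = exp_map(p, V)
distance = math.acos(sum(a * b for a, b in zip(p, q)))
print(lhs, rhs, distance)
```

The last quantity illustrates the remark that the arclength from $p$ to $\exp_p(V)$ along $\gamma_V$ is the norm of $V$, here 0.5.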
Proposition 6.3.17. For all $p \in M$, there exists a neighborhood $U$ of $p$ on $M$ and
a neighborhood $\mathcal{D}$ of the origin in $T_pM$ such that $\exp_p : \mathcal{D} \to U$ is a diffeomorphism.


Proof. The differential of $\exp_p$ at 0 is a linear transformation $d(\exp_p)_0 : T_0(T_pM) \to T_pM$.
However, since $T_pM$ is a vector space, the tangent space $T_0(T_pM)$ is
naturally identified with $T_pM$. Thus, $d(\exp_p)_0$ is a linear transformation on the
vector space $T_pM$. The proposition follows from the Inverse Function Theorem
(Theorem 1.4.5) once we show that $d(\exp_p)_0 = (\exp_p)_*$ is invertible.

We show this indirectly using the chain rule. Let $V$ be a tangent vector in
$T_pM$, and let $f : (-\delta, \delta) \to T_pM$ be the curve $f(t) = tV$. The function $\exp_p \circ f$
is a curve on $M$. Then, since $\exp_p(tV) = \gamma_V(t)$,

$$\frac{d}{dt}(\exp_p \circ f)(t)\Big|_{t=0} = \frac{d}{dt}\gamma_V(t)\Big|_{t=0} = V.$$

However, by the chain rule, we also have

$$d(\exp_p \circ f)_0 = d(\exp_p)_0 \circ df_0 = (\exp_p)_* V.$$

Hence, for all $V \in T_pM$, we have $(\exp_p)_* V = V$. Hence, $(\exp_p)_*$ is in fact the
identity transformation so it is invertible, and the proposition follows.


Now if $\{e_\mu\}$ is any basis of $T_pM$, the exponential map sets up a coordinate
system on a neighborhood of $p$ on $M$ defined by

$$\exp_p(X^\mu e_\mu).$$

We call this the normal coordinate system at $p$ with respect to $\{e_\mu\}$. If $q$ is a point
in the neighborhood $U$, as in Proposition 6.3.17, then $q$ is the image of a unique
tangent vector $X_q$ under $\exp_p$. The coordinates of $q$ are $X_q^\mu$.

Interestingly enough, the coefficients of the Levi-Civita connection vanish at $p$
in the normal coordinate system $X^\mu$ at $p$. Consider a geodesic on $M$ from $p$ to $q$
given by $c(t) = \exp_p(tX_q^\mu e_\mu)$, which in coordinates is just $X^\mu(t) = tX_q^\mu$. From the
geodesic equation,

$$0 = \frac{d^2X^\mu}{dt^2} + \Gamma^\mu_{\nu\lambda}\,\frac{dX^\nu}{dt}\frac{dX^\lambda}{dt} = \Gamma^\mu_{\nu\lambda}(tX_q)\, X_q^\nu X_q^\lambda.$$

Setting $t = 0$, we find that $\Gamma^\mu_{\nu\lambda}(0)\, X_q^\nu X_q^\lambda = 0$ for any $q$. Thus, by appropriate choices
of $q$, and because the Levi-Civita connection is symmetric in its lower indices, we
determine that $\Gamma^\mu_{\nu\lambda}(0) = 0$, which are the components of the Levi-Civita
connection at $p$ in the normal coordinate system.

The exponential map allows us to redefine some common geometric objects in
$\mathbb{R}^n$ in the context of Riemannian manifolds. Notice first that by Proposition 6.3.13,
the arclength from $p$ to $\exp_p(V)$ along $\gamma_V(t)$ is $\|V\|_p$. Now, let $r > 0$ be a positive
real number and $B_r(0)$ be the open ball of radius $r$ centered at the origin in $T_pM$. If
$r$ is small enough that $B_r(0)$ is contained in the neighborhood $\mathcal{D}$ from Proposition
6.3.17, then we call $\exp_p(B_r(0))$ the geodesic ball of radius $r$ centered at $p$. If
the sphere $S_r(0)$ of radius $r$ centered at 0 in $T_pM$ is contained in $\mathcal{D}$, then we call
$\exp_p(S_r(0))$ the geodesic sphere of radius $r$ centered at $p$.


PROBLEMS 


6.3.1. Let $S$ be a regular surface in $\mathbb{R}^3$, and let $\vec X$ be a parametrization of a coordinate
chart $U$ of $S$. Let $\nabla$ be the Levi-Civita connection on $S$ with respect to the
first fundamental form metric. Let $\vec\gamma(t) = \gamma(t)$ be a curve on $S$. Prove that the
acceleration $D_t\gamma'(t)$ is the orthogonal projection of $\vec\gamma''(t)$ onto the tangent plane
to $S$ at $\gamma(t)$.

6.3.2. Consider the torus parametrized by

$$\vec X(u, v) = ((a + b\cos v)\cos u,\; (a + b\cos v)\sin u,\; b\sin v),$$

where $a > b$. Show that the geodesics on a torus satisfy the differential equation

$$\frac{dr}{du} = \frac{1}{Cb}\, r\sqrt{r^2 - C^2}\,\sqrt{b^2 - (r - a)^2},$$

where $C$ is a constant and $r = a + b\cos v$.

6.3.3. Find the differential equations that determine geodesics on a function graph
$z = f(x, y)$.

6.3.4. If $\vec X : U \to \mathbb{R}^3$ is a parametrization of a coordinate patch on a regular surface $S$
such that $g_{11} = E(u)$, $g_{12} = 0$, and $g_{22} = G(u)$, show that

(a) the $u$-parameter curves (i.e., over which $v$ is a constant) are geodesics;
(b) the $v$-parameter curve $u = u_0$ is a geodesic if and only if $G_u(u_0) = 0$;
(c) the curve $\vec X(u, v(u))$ is a geodesic if and only if

$$v = \pm\int \frac{C\sqrt{E(u)}}{\sqrt{G(u)}\,\sqrt{G(u) - C^2}}\; du,$$

where $C$ is a constant.

6.3.5. Pseudosphere. Consider a surface with a set of coordinates $(u, v)$ defined over the
upper half of the $uv$-plane, i.e., on $H = \{(u, v) \in \mathbb{R}^2 \mid v > 0\}$, such that the metric
tensor is

$$(g_{ij}) = \begin{pmatrix} \dfrac{1}{v^2} & 0 \\ 0 & \dfrac{1}{v^2} \end{pmatrix}.$$

Prove in this coordinate system that all the geodesics appear in $H$ as vertical
lines or semicircles with center on the $u$-axis.

6.3.6. Let $(M, g)$ be a two-dimensional Riemannian manifold. Suppose that on a
coordinate patch $U$ with coordinates $x = (x^1, x^2)$, the metric is given by $g_{11} = 1$,
$g_{22} = (x^1)^2$, and $g_{12} = g_{21} = 0$. Show that the geodesics of $M$ on $U$ satisfy

$$x^1 = a\sec(x^2 + b),$$

where $a$ and $b$ are constants.
6.3.7. The Mercator projection used in cartography maps the globe (except the north
and south poles, $S^2 - \{(0, 0, 1), (0, 0, -1)\}$) onto a cylinder, which is then unrolled
into a flat map of the earth. However, one does not necessarily use the radial
projection as shown in Figure 6.7. Consider a map $f$ from $(x, y) \in (0, 2\pi) \times \mathbb{R}$ to
the spherical coordinates $(\theta, \phi) \in S^2$ of the form $(\theta, \phi) = f(x, y) = (x, h(y))$.

Figure 6.7: Mercator projection.

(a) Recall that the usual Euclidean metric on $S^2$ is

$$g = \begin{pmatrix} \sin^2\phi & 0 \\ 0 & 1 \end{pmatrix}.$$

The Mercator projection involves the above function $f(x, y)$, such that $h(y)$
gives a pull-back $f^*(g)$ that is a metric with a line element of the form
$ds^2 = G(y)\,dx^2 + G(y)\,dy^2$. Prove that $h(y) = 2\cot^{-1}(e^y)$ works, and determine
the corresponding function $G(y)$.

(b) Show that the geodesics on $\mathbb{R}^2$ equipped with the metric obtained from this
$h(y)$ are of the form

$$\sinh y = \alpha\sin(x + \beta)$$

for some constants $\alpha$ and $\beta$.

6.3.8. Consider the Poincaré ball $B^n_R$ from Problem 6.1.12. Prove that the geodesics
in the Poincaré ball are either straight lines through the origin or circles that
intersect the boundary $\partial B^n_R$ perpendicularly. (The Poincaré ball is an example of
a hyperbolic geometry. In this geometry, given a "straight line" (geodesic) $L$ and
a point $p$ not on $L$, there exists a nonempty continuous set of lines (geodesics)
through $p$ that do not intersect $L$.)

Figure 6.8: A few geodesics in the Poincaré disk.

6.3.9. Prove Proposition 6.3.4. [Hint: Use (6.27) and the fact that since the Levi-Civita
connection is compatible with $g$, then $g_{ij;k} = 0$.]

6.3.10. Prove Lemma 6.3.16.

6.3.11. Consider the usual sphere $S^2$ of radius $R$ in $\mathbb{R}^3$. In the coordinate patch where
$(\theta, \phi) \in (0, 2\pi) \times (0, \pi)$, the Christoffel symbols are given in (6.33) of Example
6.3.14, where we use the coordinates $(\theta, \phi) = (x^1, x^2)$. Consider a point $p$ on
the sphere given by $p = (\theta_0, \phi_0)$. Let $V_0$ be a vector in $T_pS^2$ with coordinates
$(V_0^1, V_0^2)$.

(a) Show that the stated Christoffel symbols used in (6.33) are correct.

(b) Calculate the coordinates of the parallel transport $V(t)$ of $V_0$ along the curve
$\gamma(t) = (\theta_0, t)$, using the initial condition $t_0 = \phi_0$. Show that the length of
the tangent vectors $V(t)$ does not change.

(c) Calculate the coordinates of the parallel transport $V(t)$ of $V_0$ along the curve
$\gamma(t) = (t, \phi_0)$, using the initial condition $t_0 = \theta_0$. Show that $\|V(t)\|^2$ is
constant.

6.3.12. Show that the locus of a geodesic on the $n$-sphere $S^n$ (as a submanifold of $\mathbb{R}^{n+1}$)
is the intersection of $S^n$ with a 2-plane that passes through the sphere's center.


6.4 Curvature Tensor 


In the study of curves and surfaces in classical differential geometry, the shape 
operator and the curvature tensor play a central role. We approach the notion of 
curvature on Riemannian manifolds in two different but equivalent ways. 


6.4.1 Coordinate-Dependent 


The first approach to curvature involves investigating mixed, partial, covariant
derivatives. For smooth functions in $\mathbb{R}^n$, mixed, second-order partial derivatives
are independent of the order of differentiation. Problem 6.2.17 showed that if a
connection $\nabla$ on $M$ is not symmetric, the same result is no longer true for the
mixed, covariant, partial derivatives of smooth functions on a manifold. We found
that if $f : M \to \mathbb{R}$, then over a given coordinate patch $U$, one has

$$f_{;j;i} - f_{;i;j} = -\tau^k_{ij}\, f_{;k},$$

where $\tau$ is the torsion tensor associated to $\nabla$ (see Problem 6.2.14), and the
coordinate components are

$$\tau^k_{ij} = \Gamma^k_{ij} - \Gamma^k_{ji}. \tag{6.35}$$

If we repeat the exercise with a vector field instead of a smooth function, a new
phenomenon appears.


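Proposition 6.4.1. Let $M$ be a smooth manifold equipped with an affine connection
$\nabla$. Let $U$ be a coordinate patch on $M$ and let $X$ be a vector field defined over $U$.
Then, in components, the mixed covariant derivatives satisfy

$$X^l_{;j;i} - X^l_{;i;j} = K^l_{ijk}\, X^k - \tau^k_{ij}\, X^l_{;k},$$

where

$$K^l_{ijk} = \frac{\partial\Gamma^l_{jk}}{\partial x^i} - \frac{\partial\Gamma^l_{ik}}{\partial x^j} + \Gamma^h_{jk}\Gamma^l_{ih} - \Gamma^h_{ik}\Gamma^l_{jh}. \tag{6.36}$$

Proof. This is a simple matter of calculation. Starting from $X^l_{;j} = \partial X^l/\partial x^j + \Gamma^l_{jk}X^k$,
we obtain

$$\begin{aligned}
X^l_{;j;i} &= \frac{\partial X^l_{;j}}{\partial x^i} + \Gamma^l_{ih}X^h_{;j} - \Gamma^k_{ij}X^l_{;k} \\
&= \frac{\partial}{\partial x^i}\left(\frac{\partial X^l}{\partial x^j} + \Gamma^l_{jk}X^k\right) + \Gamma^l_{ih}\left(\frac{\partial X^h}{\partial x^j} + \Gamma^h_{jk}X^k\right) - \Gamma^k_{ij}X^l_{;k} \\
&= \frac{\partial^2 X^l}{\partial x^i\partial x^j} + \frac{\partial\Gamma^l_{jk}}{\partial x^i}X^k + \Gamma^l_{jk}\frac{\partial X^k}{\partial x^i} + \Gamma^l_{ih}\frac{\partial X^h}{\partial x^j} + \Gamma^l_{ih}\Gamma^h_{jk}X^k - \Gamma^k_{ij}X^l_{;k}.
\end{aligned}$$

After collecting and canceling like terms, we find that

$$X^l_{;j;i} - X^l_{;i;j} = \left(\frac{\partial\Gamma^l_{jk}}{\partial x^i} - \frac{\partial\Gamma^l_{ik}}{\partial x^j} + \Gamma^h_{jk}\Gamma^l_{ih} - \Gamma^h_{ik}\Gamma^l_{jh}\right)X^k - \tau^k_{ij}X^l_{;k}.$$

The proposition follows.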


This result is particularly interesting because of the following proposition. 


Proposition 6.4.2. The collection of functions $K^l_{ijk}$ defined in Equation (6.36)
form the components of a tensor of type (1,3).


Proof. (This proposition relies on the coordinate-transformation properties of the
component functions $\Gamma^i_{jk}$, given in Proposition 6.2.5. The proof is left as an exercise
for the reader.)

The functions $K^l_{ijk}$ are the components of the so-called curvature tensor
associated to the connection $\nabla$.
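Formula (6.36) is mechanical enough to evaluate by machine. The sketch below is an illustration under stated assumptions, not the text's method: it takes the unit sphere with coordinates $(x^1, x^2) = (\theta, \phi)$ and metric $\mathrm{diag}(\sin^2\phi, 1)$, uses the Christoffel symbols $\Gamma^1_{12} = \Gamma^1_{21} = \cot\phi$ and $\Gamma^2_{11} = -\sin\phi\cos\phi$ for that metric, and approximates the partial derivatives in (6.36) by central differences. Indices are zero-based in the code, so `K[0][0][1][1]` stands for $K^1_{122}$; the names `christoffel` and `curvature` are ours.

```python
import math

def christoffel(x):
    """G[l][j][k] = Gamma^l_{jk} for the unit sphere, x = (theta, phi)."""
    phi = x[1]
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][0][1] = G[0][1][0] = math.cos(phi) / math.sin(phi)
    G[1][0][0] = -math.sin(phi) * math.cos(phi)
    return G

def curvature(x, gamma=christoffel, h=1e-5):
    """K[l][i][j][k] = K^l_{ijk} from (6.36), derivatives by central differences."""
    n = len(x)
    G = gamma(x)

    def d_gamma(i):
        # Partial derivative of every Gamma^l_{jk} with respect to x^i.
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        Gp, Gm = gamma(xp), gamma(xm)
        return [[[(Gp[l][j][k] - Gm[l][j][k]) / (2 * h) for k in range(n)]
                 for j in range(n)] for l in range(n)]

    D = [d_gamma(i) for i in range(n)]   # D[i][l][j][k] = d(Gamma^l_{jk})/dx^i
    K = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]
    for l in range(n):
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    val = D[i][l][j][k] - D[j][l][i][k]
                    for m in range(n):
                        val += G[m][j][k] * G[l][i][m] - G[m][i][k] * G[l][j][m]
                    K[l][i][j][k] = val
    return K

K = curvature((0.4, 1.0))
print(K[0][0][1][1], K[1][1][0][0])   # K^1_{122} and K^2_{211}
```

For this sphere one expects $K^1_{122} = 1$ and $K^2_{211} = \sin^2\phi$, and the antisymmetry of $K^l_{ijk}$ in $i$ and $j$ is visible directly from the form of (6.36).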

The components of the curvature tensor came into play when we considered 
the mixed, covariant, partial derivatives of a vector field instead of just a smooth 
function. It is natural to ask whether some new quantity appears when one considers 
the mixed covariant partials of other tensors. Surprisingly, the answer is no. 
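Theorem 6.4.3 (Ricci's Identities). Let $T^{i_1\cdots i_r}_{j_1\cdots j_s}$ be the components of a tensor field
of type $(r, s)$ over a coordinate patch of a manifold equipped with a connection $\nabla$.
Then the mixed, covariant, partial derivatives differ by

$$T^{i_1\cdots i_r}_{j_1\cdots j_s;k;h} - T^{i_1\cdots i_r}_{j_1\cdots j_s;h;k} = \sum_{\alpha=1}^{r} K^{i_\alpha}_{hkm}\, T^{i_1\cdots i_{\alpha-1}\, m\, i_{\alpha+1}\cdots i_r}_{j_1\cdots j_s} - \sum_{\beta=1}^{s} K^{m}_{hkj_\beta}\, T^{i_1\cdots i_r}_{j_1\cdots j_{\beta-1}\, m\, j_{\beta+1}\cdots j_s} - \tau^m_{hk}\, T^{i_1\cdots i_r}_{j_1\cdots j_s;m}.$$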

Over the coordinate patch $U$, the components of the curvature tensor satisfy the
Bianchi identities.


Proposition 6.4.4 (Bianchi Identities). With $K^l_{ijk}$ defined as in (6.36) and $\tau^k_{ij}$
defined as in Equation (6.35), then

$$K^l_{ijk} + K^l_{jki} + K^l_{kij} = \tau^l_{ij;k} + \tau^l_{jk;i} + \tau^l_{ki;j} + \tau^m_{ij}\tau^l_{mk} + \tau^m_{jk}\tau^l_{mi} + \tau^m_{ki}\tau^l_{mj}$$

and

$$K^l_{ijk;m} + K^l_{jmk;i} + K^l_{mik;j} = -\tau^h_{ij}K^l_{hmk} - \tau^h_{jm}K^l_{hik} - \tau^h_{mi}K^l_{hjk}.$$

The second Bianchi identity is also called the differential Bianchi identity.


Proof. (Left as an exercise for the reader.)


In particular, if $\nabla$ is a symmetric connection, the Bianchi identities reduce to

$$\text{first identity:}\quad K^l_{ijk} + K^l_{jki} + K^l_{kij} = 0, \tag{6.37}$$
$$\text{second identity:}\quad K^l_{ijk;m} + K^l_{jmk;i} + K^l_{mik;j} = 0 \tag{6.38}$$

for any values of any of the indices.

(We need to mention at this point that some texts vary in how they assign meaning
to the various indices of the Riemann curvature tensor and tensors associated
to it. Because of the antisymmetry properties of the curvature tensor, the variations
only lead to a possible difference in sign between component functions alternately
defined. Fortunately, the coordinate-free definition for the curvature tensors seems
to be uniformly accepted across the literature.)


6.4.2 Coordinate-Free


A second and more modern approach to curvature on a Riemannian manifold $(M, g)$
defines the curvature tensor in a coordinate-free way, though still from a perspective
of analyzing repeated covariant differentiation. If $X$, $Y$, and $Z$ are vector fields on
$M$, the difference in repeated covariant derivatives is

$$\nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z. \tag{6.39}$$

Even with general vector fields in $\mathbb{R}^n$, (6.39) does not necessarily cancel out. However,
by Problem 6.2.6, $\nabla_X\nabla_Y Z = X(Y(Z^k))\,\partial_k$, so

$$\nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z = \nabla_{[X,Y]}Z. \tag{6.40}$$

This equality might not hold for all vector fields $X, Y, Z$ on a manifold equipped
with a connection $\nabla$. This fact motivates defining the quantity

$$R(X, Y)Z \overset{\text{def}}{=} \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]}Z. \tag{6.41}$$

The notation $R(X, Y)Z$ emphasizes the understanding that for each vector field $X$
and $Y$, $R(X, Y)$ is an operator acting on $Z$. At first glance, $R(X, Y)Z$ is just a
smooth mapping $\mathfrak{X}(M) \times \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$, smooth because the resulting
vector field is smooth. However, more is true.


Proposition 6.4.5. The function $R(X, Y)Z$ defined in Equation (6.41) is a tensor
field of type (1,3), which is antisymmetric in $X$ and $Y$.


Proof. The antisymmetry property follows immediately from $[Y, X] = -[X, Y]$
and Definition 6.2.1. To prove the tensorial property, we need only to show that
$R(X, Y)Z$ is multilinear over $C^\infty(M)$ in each of the three vector fields. We show
linearity for the $X$ variable, from which linearity immediately follows for the $Y$
variable. We leave it as an exercise for the reader to prove linearity in $Z$.

Let $f_1, f_2 \in C^\infty(M)$. Then

$$R(f_1X_1 + f_2X_2, Y)Z = (f_1\nabla_{X_1} + f_2\nabla_{X_2})\nabla_Y Z - \nabla_Y(f_1\nabla_{X_1} + f_2\nabla_{X_2})Z - \nabla_{[f_1X_1, Y] + [f_2X_2, Y]}Z.$$

By Proposition 5.3.4(4), $[f_iX_i, Y] = f_i[X_i, Y] - Y(f_i)X_i$. Thus,

$$\begin{aligned}
R(f_1X_1 + f_2X_2, Y)Z
&= f_1\nabla_{X_1}\nabla_Y Z + f_2\nabla_{X_2}\nabla_Y Z - f_1\nabla_Y\nabla_{X_1}Z - Y(f_1)\nabla_{X_1}Z \\
&\quad - f_2\nabla_Y\nabla_{X_2}Z - Y(f_2)\nabla_{X_2}Z - \nabla_{f_1[X_1,Y] - Y(f_1)X_1}Z - \nabla_{f_2[X_2,Y] - Y(f_2)X_2}Z \\
&= f_1\nabla_{X_1}\nabla_Y Z - f_1\nabla_Y\nabla_{X_1}Z - f_1\nabla_{[X_1,Y]}Z + f_2\nabla_{X_2}\nabla_Y Z - f_2\nabla_Y\nabla_{X_2}Z - f_2\nabla_{[X_2,Y]}Z \\
&\quad - Y(f_1)\nabla_{X_1}Z - Y(f_2)\nabla_{X_2}Z + Y(f_1)\nabla_{X_1}Z + Y(f_2)\nabla_{X_2}Z \\
&= f_1R(X_1, Y)Z + f_2R(X_2, Y)Z.
\end{aligned}$$

Definition 6.4.6. The tensor field $R$ of type (1,3) satisfying

$$R(X, Y)(Z) = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]}Z$$

is called the curvature tensor associated to the connection $\nabla$. Occasionally, this is
denoted $R^\nabla$ to explicitly indicate which connection.


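We connect this approach to the coordinate-dependent definition (6.36) as follows.
Let $x$ be a coordinate system on a coordinate patch of $M$. By the $C^\infty(M)$-linearity,

$$R(X, Y)Z = X^iY^jZ^k\, R(\partial_i, \partial_j)\partial_k,$$

where $X = X^i\partial_i$ and similarly for $Y$ and $Z$. The components of $R$ in local
coordinates are $R^l_{ijk}$ where

$$R(\partial_i, \partial_j)\partial_k = R^l_{ijk}\,\partial_l.$$

Now since $[\partial_i, \partial_j] = 0$,

$$\begin{aligned}
R(\partial_i, \partial_j)\partial_k &= \nabla_{\partial_i}\nabla_{\partial_j}\partial_k - \nabla_{\partial_j}\nabla_{\partial_i}\partial_k \\
&= \nabla_{\partial_i}(\Gamma^h_{jk}\partial_h) - \nabla_{\partial_j}(\Gamma^h_{ik}\partial_h) \\
&= \Gamma^h_{jk}\nabla_{\partial_i}\partial_h + \frac{\partial\Gamma^h_{jk}}{\partial x^i}\partial_h - \Gamma^h_{ik}\nabla_{\partial_j}\partial_h - \frac{\partial\Gamma^h_{ik}}{\partial x^j}\partial_h \\
&= \left(\Gamma^h_{jk}\Gamma^l_{ih} - \Gamma^h_{ik}\Gamma^l_{jh} + \frac{\partial\Gamma^l_{jk}}{\partial x^i} - \frac{\partial\Gamma^l_{ik}}{\partial x^j}\right)\partial_l,
\end{aligned}$$

from which we obtain

$$R^l_{ijk} = \frac{\partial\Gamma^l_{jk}}{\partial x^i} - \frac{\partial\Gamma^l_{ik}}{\partial x^j} + \Gamma^h_{jk}\Gamma^l_{ih} - \Gamma^h_{ik}\Gamma^l_{jh},$$

which recovers exactly the coordinate-dependent definition (6.36).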


6.4.3 Riemannian Curvature


Our presentation of the curvature tensor so far applies to any affine connection. We 
turn to the specific example of a Riemannian manifold (M, g). 


Definition 6.4.7. The curvature tensor associated to the Levi-Civita connection
of the metric $g$ is called the Riemann curvature tensor, denoted $R$.

As above, we denote the components of the curvature tensor by $R^l_{ijk}$. Since
$\nabla$ is symmetric, the torsion tensor $\tau$ associated to the Levi-Civita connection is
identically 0. In coordinate-free expression, the Bianchi identities for the Riemann
tensor are

$$R(X, Y)Z + R(Y, Z)X + R(Z, X)Y = 0, \tag{6.42}$$
$$(\nabla_X R)(Y, Z)W + (\nabla_Y R)(Z, X)W + (\nabla_Z R)(X, Y)W = 0. \tag{6.43}$$

By contracting with the metric tensor $g$, we obtain a tensor field of type (0,4),
which in components is

$$R_{ijkl} = g_{ml}R^m_{ijk}. \tag{6.44}$$


We define this tensor also in a coordinate-free way. 


Definition 6.4.8. If $R$ is the Riemann curvature tensor on a Riemannian manifold
$(M, g)$, then $R^\flat$, which is commonly denoted $Rm$, is the Riemann covariant
curvature tensor. In other words, for all vector fields $X, Y, Z, W$ on $M$,

$$Rm(X, Y, Z, W) = g(R(X, Y)Z, W).$$

We write the components of $Rm$ with respect to a basis as $R_{ijkl}$.


Not all the component functions of $R^l_{ijk}$ or of $R_{ijkl}$ are independent. We now
wish to determine the number of independent component functions in $R_{ijkl}$, which
will be the same number of independent component functions of $R^l_{ijk}$.


By the definition from (6.36), we see that $R^l_{ijk} = -R^l_{jik}$ and, therefore, that

$$R_{ijkl} = -R_{jikl}. \tag{6.45}$$

Furthermore, the first Bianchi identity gives

$$R_{ijkl} + R_{jkil} + R_{kijl} = 0. \tag{6.46}$$

The compatibility condition of the Levi-Civita connection is expressed in coordinates
as $g_{ij;k} = 0$ as functions for all indices $i, j, k$. This leads to another relation.
Theorem 6.4.3 and the compatibility condition imply that

$$0 = g_{kl;j;i} - g_{kl;i;j} = -R^m_{ijk}\,g_{ml} - R^m_{ijl}\,g_{km}, \tag{6.47}$$

which is tantamount to

$$R_{ijkl} = -R_{ijlk}. \tag{6.48}$$

Equations (6.45) and (6.48) show that the covariant curvature tensor is skew-symmetric
in the first two indices and also in the last two indices. Furthermore,
this skew-symmetry relation combined with the identity in (6.46) leads to

$$R_{ijkl} = R_{klij}. \tag{6.49}$$

We can see this from the following calculation, starting with the Bianchi identity (6.46)
and using it again in the middle (to rewrite $R_{jkli}$ and $R_{kilj}$) and in the last line:

$$\begin{aligned}
0 &= -(R_{ijkl} + R_{jkil} + R_{kijl}) = -R_{jkil} - R_{kijl} + R_{jikl} \\
&= R_{klij} + R_{ljik} + R_{klij} + R_{likj} + R_{jikl} \\
&= 2R_{klij} - R_{jlik} - R_{lijk} + R_{jikl} \qquad\text{from (6.48) and (6.45)} \\
&= 2R_{klij} + R_{ijlk} + R_{jikl} = 2R_{klij} - 2R_{ijkl}.
\end{aligned}$$

Equation (6.49) follows.

We can now count the number of independent functions given the relations
in (6.45), (6.46) and (6.48). There are five separate cases depending on how many
indices are distinct. By virtue of (6.45), the cases when all indices are equal or when
three of the indices are equal lead to identically 0 functions for the components of
the covariant tensor. If there are two pairs of equal indices, then we must have
$R_{iijj} = 0$ while the quantities $R_{ijij}$ could be nonzero. In this case, the identities in
(6.45), (6.46) and (6.49) explicitly determine all other possibilities with two pairs
of equal indices from $R_{ijij}$. There are $\binom{n}{2}$ ways to select the pair $\{i, j\}$ to define
$R_{ijij}$. If the indices have one pair of equal indices and the other two indices are
different, then by (6.45) and (6.48), the only nonzero possibilities can be determined
by $R_{ijik}$ (where $i$, $j$, and $k$ are all distinct). Hence, there are $n\binom{n-1}{2}$ choices of
independent functions here. Lastly, suppose that all four indices are distinct. All
the functions for combinations of indices can be obtained from the relations, given
the functions for $R_{ijkl}$ and $R_{iljk}$. Thus, there are $2\binom{n}{4}$ independent functions in
this case. In total, the covariant curvature tensor is determined by

$$\binom{n}{2} + n\binom{n-1}{2} + 2\binom{n}{4} = \frac{n^2(n^2 - 1)}{12}$$

independent functions.
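The count can be tabulated for small dimensions. The short sketch below simply adds the three binomial contributions described above and compares the total with $n^2(n^2 - 1)/12$; the function name `independent_components` is ours and serves only as an arithmetic check.

```python
from math import comb

def independent_components(n):
    """Number of independent components of the covariant curvature tensor in dimension n."""
    two_pairs = comb(n, 2)           # components of the form R_{ijij}
    one_pair = n * comb(n - 1, 2)    # components of the form R_{ijik}, with i, j, k distinct
    all_distinct = 2 * comb(n, 4)    # two per unordered set of four distinct indices
    return two_pairs + one_pair + all_distinct

counts = {n: independent_components(n) for n in (2, 3, 4)}
print(counts)
```

In particular, a surface carries a single independent curvature function, a 3-manifold carries 6, and a 4-manifold carries 20.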


Example 6.4.9. It is interesting to note that for manifolds of dimension $n = 2$,
there is only one independent function in the curvature tensor, namely, $R_{1212}$.
Equation (7.47) in [5] shows that the Gaussian curvature of the surface at any point
is equal to

$$K = -R_{1212}/\det(g_{ij}). \tag{6.50}$$

Because of cancellations for repeated indices, an elegant way to rewrite this gives
us the components of the Riemann covariant curvature tensor:

$$R_{ijkl} = K(g_{il}g_{jk} - g_{ik}g_{jl}). \tag{6.51}$$
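As a sanity check on this formula, one can build $R_{ijkl} = K(g_{il}g_{jk} - g_{ik}g_{jl})$ from an arbitrary symmetric positive definite $2 \times 2$ matrix and confirm that it satisfies the symmetries (6.45), (6.46), (6.48), and (6.49) and returns $K$ through (6.50). The matrix `g`, the value `K`, and the function name `curvature_from_gauss` below are arbitrary illustrative choices; indices are zero-based, so `Rm[0][1][0][1]` stands for $R_{1212}$.

```python
def curvature_from_gauss(g, K):
    """Rm[i][j][k][l] = K (g_il g_jk - g_ik g_jl), the form of (6.51) at one point."""
    n = len(g)
    return [[[[K * (g[i][l] * g[j][k] - g[i][k] * g[j][l]) for l in range(n)]
              for k in range(n)] for j in range(n)] for i in range(n)]

g = [[2.0, 0.5], [0.5, 1.0]]   # metric coefficients at a point (sample values)
K = 0.75                        # Gaussian curvature at that point (sample value)
Rm = curvature_from_gauss(g, K)
det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]

# Recover K from the single independent component, as in (6.50).
K_recovered = -Rm[0][1][0][1] / det_g
print(K_recovered)
```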


Properties of the Riemann covariant curvature tensor presented in a coordinate- 
dependent manner have equivalent expressions in a coordinate-free formulation. 


Proposition 6.4.10. The covariant curvature tensor $Rm$ satisfies the following
symmetry properties for vector fields $X, Y, Z, W$, and $T$:

1. $Rm(X, Y, Z, W) = -Rm(Y, X, Z, W)$.
2. $Rm(X, Y, Z, W) = -Rm(X, Y, W, Z)$.
3. $Rm(X, Y, Z, W) = Rm(Z, W, X, Y)$.
4. Bianchi's first identity:

$$Rm(X, Y, Z, W) + Rm(Y, Z, X, W) + Rm(Z, X, Y, W) = 0.$$

5. Bianchi's differential identity:

$$\nabla Rm(X, Y, Z, W, T) + \nabla Rm(X, Y, W, T, Z) + \nabla Rm(X, Y, T, Z, W) = 0.$$

Figure 6.9: Geometric interpretation of the torsion tensor.

Figure 6.10: Geometric interpretation of the curvature tensor.


6.4.4 Geometric Interpretation 


Until now, we have not given an interpretation for the geometric meaning of the 
curvature or torsion tensors. 

Consider first the torsion tensor. (Of course, by definition, the Levi-Civita 
connection is symmetric and so the torsion is 0, but we give an interpretation for 
any affine connection.) We will use a first-order approximation discussion, following 
the presentation in [44, Section 7.3.2]. This reasoning differs slightly from a rigorous 
mathematical explanation, but we include it for the sake of familiarity with physics- 
style reasoning. 

Let $p \in M$, with coordinates $x^\mu$ in a coordinate system on $M$. Let $X = \delta^\mu\partial_\mu$
and $Y = \varepsilon^\mu\partial_\mu$ be two vectors in $T_pM$. Let $\gamma_X(t)$ be the curve with coordinate
functions $\delta^\mu t$ and let $\gamma_Y(t)$ be the curve with coordinate functions $\varepsilon^\mu t$. Consider
the parallel transport of the vector $X$ along $\gamma_Y(t)$. The coordinates of the resulting
vector are $\delta^\mu + \Gamma^\mu_{\lambda\nu}\delta^\lambda\varepsilon^\nu$. The coordinates of $p'$, the tip of the parallel transport of
$X$, are

$$p' : \quad \delta^\mu + \varepsilon^\mu + \Gamma^\mu_{\lambda\nu}\,\delta^\lambda\varepsilon^\nu.$$

If we take the parallel transport of $Y$ along $\gamma_X(t)$, the coordinates of the resulting
vector are $\varepsilon^\mu + \Gamma^\mu_{\lambda\nu}\varepsilon^\lambda\delta^\nu$. The coordinates of $p''$, the tip of the parallel transport of
$Y$, are

$$p'' : \quad \delta^\mu + \varepsilon^\mu + \Gamma^\mu_{\lambda\nu}\,\varepsilon^\lambda\delta^\nu.$$

The difference between these two parallel transports is $(\Gamma^\mu_{\lambda\nu} - \Gamma^\mu_{\nu\lambda})\delta^\lambda\varepsilon^\nu$, which is
$\tau^\mu_{\lambda\nu}\delta^\lambda\varepsilon^\nu$. Therefore, intuitively speaking, the torsion tensor gives a local measure of
how much the parallel transport of two noncollinear directions with respect to each
other fails to close a parallelogram (see Figure 6.9).

The curvature tensor, on the other hand, measures the path dependence of
parallel transport. In the coordinate-free definition of the curvature tensor from
(6.41), the expression $\nabla_X\nabla_Y Z$ is a vector field that measures the rate of change
of parallel transport of the vector field $Z$ along an integral curve of $Y$ and then a
rate of change of parallel transport of this $\nabla_Y Z$ along an integral curve of $X$. The
expression $\nabla_Y\nabla_X Z$ reverses the process.

As discussed in Section 5.2 in the subsection on Lie brackets (see also Figure 5.4),
the successive flows of a distance $h$ along the integral curves of $Y$ and then along the
integral curves of $X$ do not in general lead one to the same point if one follows the
integral curves of $X$ and then of $Y$. Proposition 5.3.14 shows that $[X, Y]$ is a sort of
measure for this nonclosure of integral paths in vector fields. Subtracting $\nabla_{[X,Y]}Z$
from $\nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z$ eliminates the quantity of path dependence of parallel
transport on a manifold that is naturally caused by the nonclosure of "square" paths
of integral curves in vector fields.

Another perspective is to consider a vector $Z$ based at $p$ with coordinates $x^\mu$
and look at the path dependence of the parallel transport along two sides of a
"parallelogram" based at $p$ and spanned by directions $\delta^\mu$ and $\varepsilon^\mu$. Locally, i.e.,
when $\delta^\mu$ and $\varepsilon^\mu$ are small, the parallel transport of $Z$ from $p$ to $q = (x^\mu + \delta^\mu)$ to
$s = (x^\mu + \delta^\mu + \varepsilon^\mu)$ produces a vector $A'Z$. Similarly, the parallel transport of $Z$
from $p$ to $r = (x^\mu + \varepsilon^\mu)$ to $s = (x^\mu + \delta^\mu + \varepsilon^\mu)$ produces a vector $A''Z$ (see Figure
6.10). The difference $Z \mapsto A''Z - A'Z$ is a linear transformation defined locally at
$p$ that depends on the directions $\delta^\mu$ and $\varepsilon^\mu$. In fact, it is not hard to show that, in
coordinates, the first order approximation in the variables $\delta$ and $\varepsilon$ is

$$(A''Z - A'Z)^l = R^l_{ijk}\,\delta^i\varepsilon^j Z^k.$$
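The path dependence of parallel transport can also be observed around a finite closed loop. The sketch below is an illustration under stated assumptions: it works on the unit sphere with coordinates $(\theta, \phi)$, assumes the standard parallel-transport equation $\dot V^k + \Gamma^k_{ij}\dot x^i V^j = 0$ with $\Gamma^1_{12} = \Gamma^1_{21} = \cot\phi$ and $\Gamma^2_{11} = -\sin\phi\cos\phi$, and transports a vector once around the latitude circle $\gamma(t) = (t, \phi_0)$ with a Runge-Kutta integrator. The function names and the choice $\phi_0 = \pi/3$ are ours.

```python
import math

def transport_along_latitude(phi0, V0, steps=2000):
    """Parallel transport V0 once around the circle gamma(t) = (t, phi0), t in [0, 2*pi]."""
    cot = math.cos(phi0) / math.sin(phi0)
    sc = math.sin(phi0) * math.cos(phi0)

    def rhs(V):
        # dV^1/dt = -Gamma^1_{12} V^2,  dV^2/dt = -Gamma^2_{11} V^1
        return (-cot * V[1], sc * V[0])

    h = 2.0 * math.pi / steps
    V = tuple(V0)
    for _ in range(steps):
        k1 = rhs(V)
        k2 = rhs((V[0] + h / 2 * k1[0], V[1] + h / 2 * k1[1]))
        k3 = rhs((V[0] + h / 2 * k2[0], V[1] + h / 2 * k2[1]))
        k4 = rhs((V[0] + h * k3[0], V[1] + h * k3[1]))
        V = (V[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             V[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return V

def metric_norm(phi0, V):
    """Length of V in the metric diag(sin^2(phi0), 1)."""
    return math.sqrt(math.sin(phi0) ** 2 * V[0] ** 2 + V[1] ** 2)

phi0 = math.pi / 3
V0 = (1.0 / math.sin(phi0), 0.0)     # unit vector pointing along the latitude circle
V_end = transport_along_latitude(phi0, V0)
print(V_end, metric_norm(phi0, V_end))
```

In an orthonormal frame the transported vector rotates by the angle $2\pi\cos\phi_0$ over one circuit, so for $\phi_0 = \pi/3$ it returns as the negative of the initial vector while keeping its length; on a flat surface it would return unchanged.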


PROBLEMS 


6.4.1. Calculate the 16 component functions of the curvature tensor for the sphere $S^2$ in
the standard $(\theta, \phi)$ coordinate system.


6.4.2. Prove Proposition 6.4.2.


6.4.3. (a) Prove the first Bianchi identity in Proposition 6.4.4 using a coordinate-dependent
approach.

(b) Prove the first Bianchi identity in Proposition 6.4.10 using a coordinate-free
approach.

6.4.4. (a) Prove the second Bianchi identity in Proposition 6.4.4 using a coordinate-dependent
approach.

(b) Prove the second (differential) Bianchi identity in Proposition 6.4.10 using
a coordinate-free approach. [Hint: This can be long and tedious if done
directly. Instead, since $\nabla Rm$ is $C^\infty(M)$-multilinear, choose $X, Y, Z, W, T$
to be coordinate basis vector fields. Also, to make the computations even
easier, use the normal coordinate system.]

6.4.5. Prove that the quantity $R(X, Y)Z$ defined in (6.41) is $C^\infty(M)$-linear in the $Z$
variable.

6.4.6. A smooth family of smooth curves is a function $c : (-\varepsilon, \varepsilon) \times [a, b] \to M$ such
that $c_s(t) = c(s, t)$ is a smooth curve in $M$ for each $s \in (-\varepsilon, \varepsilon)$. Note that by
symmetry, $c_t(s)$ is also a smooth curve for each $t \in [a, b]$. A vector field along $c$
is a smooth map $V : (-\varepsilon, \varepsilon) \times [a, b] \to TM$ such that $V(s, t) \in T_{c(s,t)}M$ for each
$(s, t)$. Define the vector fields $S$ and $T$ on $c$ by $S = \partial_s c$ and $T = \partial_t c$, i.e., the tangent
vectors to $c$ in the indicated direction. Show that for any vector field $V$ on $c$,

$$D_sD_tV - D_tD_sV = R(S, T)V.$$

(This gives another geometric interpretation of the curvature tensor.)

6.4.7. The Jacobi Equation. This exercise considers variations along a geodesic $\gamma$. A
variation through geodesics along $\gamma$ is a smooth family of smooth curves $c$ (defined
in Problem 6.4.6) such that for each $s$, the curve $c_s(t) = c(s, t)$ is a geodesic and
$c(0, t) = \gamma(t)$. The variation field $V$ of a variation through geodesics along $\gamma$ is
the vector field along $\gamma$ defined by $V(t) = (\partial_s c)(0, t)$. Show that $V$ satisfies the
Jacobi equation

$$D_t^2V + R(V, \dot\gamma)\dot\gamma = 0. \tag{6.52}$$

6.4.8. Consider the 3-sphere $S^3$, and consider the coordinate patch given by the
parametrization described in Problem 6.1.14. Calculate the curvature tensor, the Ricci
curvature tensor, and the scalar curvature.

6.4.9. Calculate the curvature tensor, the Ricci curvature tensor, and the scalar curvature
for the Poincaré ball. (See Problem 6.1.12.)

6.4.10. Consider the 3-torus described in Problem 6.1.2(b) with the metric induced from
$\mathbb{R}^4$. Calculate all the components of the curvature tensor, the Ricci tensor, and
the scalar curvature, given in the coordinates defined by the parametrization given
in Problem 6.1.2(b).

6.4.11. Consider the metric associated to spherical coordinates in $\mathbb{R}^3$, given by

$$g = dr^2 + r^2\sin^2\phi\; d\theta^2 + r^2\, d\phi^2.$$

(Note, we have used the mathematics labeling of the longitude and latitude angles.
Physics texts usually have the $\theta$ and $\phi$ reversed.) Prove that all the components
of the curvature tensor are identically 0.

6.4.12. Consider the Riemannian manifold of dimension 2 equipped with the metric
$g = f(u + v)(du^2 + dv^2)$ for some function $f$. Solve for which $f$ lead to $R_{jklm} = 0$.
6.4.13. Let $R$ be the Riemann curvature tensor (defined with respect to the Levi-Civita
connection). Prove that

$$R_{ijkl} = \frac{1}{2}\left(\frac{\partial^2 g_{ik}}{\partial x^j\partial x^l} - \frac{\partial^2 g_{jk}}{\partial x^i\partial x^l} - \frac{\partial^2 g_{il}}{\partial x^j\partial x^k} + \frac{\partial^2 g_{jl}}{\partial x^i\partial x^k}\right) + g_{mh}\left(\Gamma^m_{jl}\Gamma^h_{ik} - \Gamma^m_{il}\Gamma^h_{jk}\right).$$

Conclude that in normal coordinates centered at $p$, the following holds at $p$:

$$R_{ijkl} = \frac{1}{2}\left(\partial_j\partial_l g_{ik} - \partial_i\partial_l g_{jk} - \partial_j\partial_k g_{il} + \partial_i\partial_k g_{jl}\right).$$

6.4.14. The Killing Equation. Let $(M, g)$ be a Riemannian manifold and let $X \in \mathfrak{X}(M)$.
Consider the function $f_\varepsilon : M \to M$ defined by

$$f_\varepsilon(p) = \gamma(\varepsilon),$$

where $\gamma$ is the integral curve of $X$ through $p$. Thus, the linear approximation of
$f_\varepsilon$ for small $\varepsilon$ maps $p = (x^i)$ to the point with coordinates $x^i + \varepsilon X^i(p)$. Suppose
that $f_\varepsilon$ is an isometry for infinitesimal $\varepsilon$.

(a) Use a linear approximation in $\varepsilon$ on the change-of-coordinates formula for the
metric $g$ to show that $g$ and $X$ satisfy the Killing equation:

$$\frac{\partial g_{ij}}{\partial x^k}X^k + g_{kj}\frac{\partial X^k}{\partial x^i} + g_{ik}\frac{\partial X^k}{\partial x^j} = 0. \tag{6.53}$$

(b) Let $\nabla$ be the Levi-Civita connection. Show that the Killing equation is
equivalent to the condition that $(\nabla X)^\flat$ is antisymmetric. In components
related to a coordinate system, this means that

$$X_{i;j} + X_{j;i} = 0, \tag{6.54}$$

where $X_i = g_{ik}X^k$.

6.4.15. Consider a covector field $\omega$ on a Riemannian manifold $(M, g)$. Suppose that $\omega$
satisfies the covariant Killing equation (see (6.54)), i.e., $\omega_{i;j} + \omega_{j;i} = 0$. Show that
along any geodesic $\gamma(s)$ of $M$, $\omega(\dot\gamma)$ is constant.

6.4.16. Show that if $R_{ijkl} + R_{ljki} = 0$, then the covariant curvature tensor is identically
0.


6.5 Ricci Curvature and Einstein Tensor


We finish this chapter with a brief section on various tensors associated to the
Riemann curvature tensor.
Since tensors of type (1,3) or (0,4) are so unwieldy, there are a few common
ways to summarize some of the information contained in the curvature tensor.
One of the most common constructions is the Ricci curvature tensor, denoted
by Rc or Ric. We tend to write $R_{ij}$ instead of $\mathrm{Rc}_{ij}$ for the components of this
tensor with respect to a coordinate system. The Ricci curvature tensor is $\operatorname{Tr} R$, or
the trace (with respect to the first indices) of the Riemann curvature tensor. In
coordinates, the components are defined by
$$R_{ij} = R^k_{kij} = g^{km} R_{kijm}.$$
By the symmetries of the curvature tensor, $R_{ij}$ can be expressed equivalently as
$$R_{ij} = R^k_{kij} = -R^k_{ikj} = -g^{km} R_{ikjm} = -g^{km} R_{imjk}.$$


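Proposition 6.5.1. The Ricci tensor Rc is symmetric.


Proof. We prove this within the context of a coordinate system. Since $R_{ij} = R^k_{kij}$,
then
$$R_{ij} = \frac{\partial \Gamma^k_{ij}}{\partial x^k} - \frac{\partial \Gamma^k_{jk}}{\partial x^i} + \Gamma^h_{ij}\Gamma^k_{kh} - \Gamma^h_{kj}\Gamma^k_{ih}. \tag{6.55}$$
In this expression, since the connection is symmetric, the first and third terms of
the right-hand side are obviously symmetric in $i$ and $j$. The fourth term $\Gamma^h_{kj}\Gamma^k_{ih}$
is also symmetric in $i$ and $j$ by a relabeling of the summation variables $h$ and $k$.
Surprisingly, the second term in (6.55) is also symmetric.
By Problem 6.2.19,
$$\Gamma^k_{jk} = \frac{\partial}{\partial x^j}\ln\sqrt{\det g}.$$
Thus,
$$\frac{\partial \Gamma^k_{jk}}{\partial x^i} = \frac{\partial^2}{\partial x^i \partial x^j}\ln\sqrt{\det g} = \frac{\partial^2}{\partial x^j \partial x^i}\ln\sqrt{\det g} = \frac{\partial \Gamma^k_{ik}}{\partial x^j}.$$
Hence, all the terms in (6.55) are symmetric in $i$ and $j$, so $R_{ij} = R_{ji}$ and the result
follows.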


Definition 6.5.2. The scalar curvature function $S$ is defined as the trace of the
Ricci tensor with respect to $g$, i.e.,
$$S = \operatorname{Tr}_g \mathrm{Rc} = g^{ij} R_{ij}. \tag{6.56}$$


Sometimes, texts use the letter $R$ to denote the scalar curvature, but we have
opted for the other common notation of $S$ so as not to confuse it with the curvature
tensor symbol.


Example 6.5.3 (Ricci Tensor of a Surface). We observed in Example 6.4.9 that
the covariant Riemann curvature tensor Rm for a 2-manifold depends on only
one function, $R_{1212}$. Symmetry and antisymmetry properties of the tensor determine
all the component functions from this one. Furthermore, we observed that
$R_{1212} = -K \det(g_{ij})$, where $K$ is the Gaussian curvature of the surface that arises
in classical differential geometry. We can write this as
$$R_{ijkl} = K(g_{il}g_{jk} - g_{ik}g_{jl}).$$
By definition of the Ricci tensor,
$$\begin{aligned}
R_{ij} &= g^{km} R_{kijm} = K g^{km}(g_{km}g_{ij} - g_{kj}g_{im}) \\
&= K(g^{km}g_{km}g_{ij} - g^{km}g_{kj}g_{im}) = K(\delta^m_m g_{ij} - \delta^m_j g_{im}) \\
&= K(2g_{ij} - g_{ij}) = K g_{ij}.
\end{aligned}$$
Hence, the Ricci curvature tensor is (locally) proportional to the metric tensor, by
the factor of the Gaussian curvature. Furthermore, this implies that
$$S = g^{ij}R_{ij} = K g^{ij} g_{ij} = 2K.$$
So for all surfaces, whether embedded in $\mathbb{R}^3$ or not, the scalar curvature function is
twice the Gaussian curvature.
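The conclusion of Example 6.5.3 can be checked numerically. The following sketch is not part of the text: it assumes the round sphere of radius $a$ with metric $a^2\sin^2\phi\,d\theta^2 + a^2\,d\phi^2$, approximates derivatives by central finite differences, and evaluates the Ricci tensor from the coordinate expression $R_{ij} = \partial_k\Gamma^k_{ij} - \partial_i\Gamma^k_{kj} + \Gamma^h_{ij}\Gamma^k_{kh} - \Gamma^h_{kj}\Gamma^k_{ih}$. All function names and the step size `h` are illustrative choices.

```python
import math

def sphere_metric(x, a=2.0):
    """g_ij for a sphere of radius a; x = (theta, phi), phi is the colatitude (assumed example)."""
    _, phi = x
    return [[a**2 * math.sin(phi)**2, 0.0], [0.0, a**2]]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def christoffel(metric, x, h=1e-3):
    """gam[l][i][j] approximates Gamma^l_{ij} using central differences of the metric."""
    n = len(x)
    ginv = inverse_2x2(metric(x))
    dg = []  # dg[k][i][j] approximates d g_ij / d x^k
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)] for i in range(n)])
    return [[[0.5 * sum(ginv[l][m] * (dg[i][m][j] + dg[j][m][i] - dg[m][i][j])
                        for m in range(n))
              for j in range(n)] for i in range(n)] for l in range(n)]

def ricci(metric, x, h=1e-3):
    """ric[i][j] approximates R_ij = d_k G^k_ij - d_i G^k_kj + G^s_ij G^k_ks - G^s_kj G^k_is."""
    n = len(x)
    gam = christoffel(metric, x, h)
    dgam = []  # dgam[k][l][i][j] approximates d Gamma^l_{ij} / d x^k
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        gp, gm = christoffel(metric, xp, h), christoffel(metric, xm, h)
        dgam.append([[[(gp[l][i][j] - gm[l][i][j]) / (2 * h) for j in range(n)]
                      for i in range(n)] for l in range(n)])
    ric = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(n):
                total += dgam[k][k][i][j] - dgam[i][k][k][j]
                for s in range(n):
                    total += gam[s][i][j] * gam[k][k][s] - gam[s][k][j] * gam[k][i][s]
            ric[i][j] = total
    return ric

def scalar_curvature(metric, x, h=1e-3):
    """S = g^{ij} R_ij (two-dimensional case only, because of inverse_2x2)."""
    ginv = inverse_2x2(metric(x))
    ric = ricci(metric, x, h)
    return sum(ginv[i][j] * ric[i][j] for i in range(2) for j in range(2))

point = [0.3, 1.1]          # an arbitrary point away from the poles
K = 1.0 / 2.0**2            # Gaussian curvature of a sphere of radius a = 2
print("S  =", scalar_curvature(sphere_metric, point))
print("2K =", 2 * K)
```

Up to discretization error, the printed value of $S$ should agree with $2K$, and the entries of `ricci(...)` with $K g_{ij}$.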


The scalar curvature function allows us to define the Einstein tensor, which is
of fundamental importance.

Definition 6.5.4. On any Riemannian manifold $(M,g)$ the Einstein tensor $G$ is
the tensor of type (0,2) described in coordinates by
$$G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} S, \tag{6.57}$$
where $S$ is the scalar curvature.

Since the Ricci curvature tensor and the metric tensor are symmetric, i.e., in
$\mathrm{Sym}^2(TM^*)$, the Einstein tensor field is also symmetric. As we will see in
Section 7.5, the Einstein tensor is of central importance in general relativity. From
a purely geometric perspective, the Einstein tensor has the following important
property.


Proposition 6.5.5. Let $G$ be the Einstein tensor on a Riemannian manifold. Then,
using (6.25),
$$\operatorname{div} G = 0.$$
In coordinates, this reads $G^{\alpha}_{\nu;\alpha} = (g^{\alpha\mu}G_{\mu\nu})_{;\alpha} = g^{\alpha\mu}G_{\mu\nu;\alpha} = 0$.


Proof. The proof of this proposition follows from the differential Bianchi identity.
For the Riemann curvature tensor, by Proposition 6.4.10(5), we have
$$R_{ijkl;m} + R_{ijlm;k} + R_{ijmk;l} = 0.$$
Taking the trace with respect to $g$ over the variable pair $(i, l)$,
$$R_{jk;m} - R_{jm;k} + g^{il} R_{ijmk;l} = 0,$$
where the trace operator commutes with the covariant derivative because of the
compatibility condition of the Levi-Civita connection. Multiplying by $g^{jk}$ and
contracting in both indices gives
$$S_{;m} - g^{jk} R_{jm;k} - g^{il} R_{im;l} = 0.$$
Relabeling summation indices and using the symmetry of $g$ and Rc, we deduce that
$$S_{;m} - 2 g^{jk} R_{jm;k} = 0. \tag{6.58}$$
But $G^{\alpha}_{\nu} = g^{\alpha\mu}G_{\mu\nu} = g^{\alpha\mu}R_{\mu\nu} - \frac{1}{2}\delta^{\alpha}_{\nu} S$, so
$$G^{\alpha}_{\nu;\alpha} = g^{\alpha\mu}R_{\mu\nu;\alpha} - \frac{1}{2}\delta^{\alpha}_{\nu} S_{;\alpha} = g^{\alpha\mu}R_{\mu\nu;\alpha} - \frac{1}{2} S_{;\nu},$$
and the vanishing divergence follows from (6.58). The last claim in the proposition
follows from Problem 6.2.16.
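As a quick consistency check of Definition 6.5.4 (an illustration added here, not a statement from the text), consider a surface. Example 6.5.3 gives $R_{ij} = K g_{ij}$ and a scalar curvature equal to twice the Gaussian curvature, so the Einstein tensor vanishes identically in dimension 2:

```latex
G_{ij} \;=\; R_{ij} - \tfrac{1}{2}\, g_{ij}\, S
       \;=\; K g_{ij} - \tfrac{1}{2}\, g_{ij}\,(2K)
       \;=\; 0 .
```

In particular, the identity $\operatorname{div} G = 0$ carries no information for surfaces; it becomes a genuine constraint only in dimension 3 and higher.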


Of particular interest in Riemannian geometry and in general relativity are manifolds
in which the Ricci curvature is proportional to the metric tensor. The corresponding
metric is called an Einstein metric and the manifold is called an Einstein
manifold. More precisely, a Riemannian manifold $(M, g)$ has an Einstein metric if
$$\mathrm{Rc} = k g \tag{6.59}$$
for some constant $k \in \mathbb{R}$. Taking the trace with respect to $g$ of (6.59) and noting
that $\operatorname{Tr}_g g = \dim M$, we find that $k$ must satisfy
$$k = \frac{S}{\dim M}. \tag{6.60}$$


This leads to the following interesting property. 


Proposition 6.5.6. If (M,g) is an Einstein manifold, then the scalar curvature is 
constant on each connected component of M. 
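The trace computation behind (6.60) can be spelled out as follows. This is a sketch added for illustration; the unit sphere $S^n$ with its round metric is used as an assumed example and is not taken from the text.

```latex
% Trace of Rc = k g with respect to g:
\operatorname{Tr}_g(\mathrm{Rc}) \;=\; g^{ij} R_{ij} \;=\; k\, g^{ij} g_{ij}
  \;=\; k\, \delta^i_i \;=\; k \dim M
  \qquad\Longrightarrow\qquad S = k \dim M .

% Illustration (round unit sphere S^n, n >= 2):
\mathrm{Rc} = (n-1)\, g, \qquad S = n(n-1), \qquad k = \frac{S}{n} = n-1 .
```

Since $k$ is a constant by definition, $S = k \dim M$ is constant as well, which is the content of Proposition 6.5.6.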


In part because of Proposition 6.5.6, Einstein metrics remain an active area of
research, not only because of their applications to physics but even more because
of their potential role in classification theorems for manifolds up to diffeomorphism.
The Uniformization Theorem, a fundamental result in the theory of surfaces,
establishes that every connected 2-manifold admits a Riemannian metric with
constant Gaussian curvature. This in turn leads to a classification of diffeomorphism
classes for surfaces.

One could hope that, in parallel with surfaces, all connected higher-dimensional
manifolds ($\dim M > 2$) would possess an Einstein metric that would in turn lead to
a classification theorem of diffeomorphism classes of manifolds. This turns out not
to be the case. There do exist higher-dimensional compact manifolds that admit
no Einstein metric ([9]). Nevertheless, in attempts to reach a generalization of the
Uniformization Theorem for higher-dimensional manifolds, Einstein metrics play a
vital role.


PROBLEMS


6.5.1. Suppose that on a Riemannian manifold $(M,g)$, the curvature tensor satisfies
div $R = 0$, or in coordinates $R^i_{jkl;i} = 0$ for all $j,k,l$. Show that the following
also hold:

(a) $R_{ij;k} = R_{ik;j}$;

(b) $S_{;j} = 0$;

(c) go” Rigi Rmh + 9" Rint Rmg + go” Rint Rme = 0.


6.5.2. Some authors define an Einstein manifold to be a Riemannian manifold such that
the Ricci curvature tensor is proportional to the metric tensor in the sense that
$\mathrm{Rc} = \lambda g$, where $\lambda \in C^\infty(M,\mathbb{R})$ is a smooth function on $M$. (The definition given
in the text requires that $\lambda$ be a constant.) Prove that if the manifold has dimension
$n \geq 3$, then this alternate definition of an Einstein manifold also implies that the
scalar curvature is constant on all connected components of the manifold. [Hint:
Show that $S_{;i} = 0$.]


6.5.3. Let $G^i_j = g^{ik} G_{kj}$, where $G_{jk}$ are the components of the Einstein tensor, and define
Roi = GR ies Prove that
Gt = 1 WK Re
VK?


Jj “4 JAM


where we have used the generalized Kronecker symbol defined in (4.34).


CHAPTER 7


Applications of Manifolds to Physics 


In the previous chapters, we set forth the goal of doing calculus on curved spaces as 
the motivating force behind the development of the theory of manifolds. Occasion- 
ally, we showcased applications to physics either in examples or exercise problems. 
Having developed a theory of manifolds, we now present five applications to physics 
that utilize this theory. Consequently, throughout this chapter, the motivation for 
topics is inverted as compared to the rest of the book: instead of starting from a 
mathematical structure and looking for applications to physics, we begin with con- 
cepts from physics and see how the theory of manifolds can provide a framework 
for the idea. Each section shows just the tip of the iceberg on very broad areas of 
active investigation. 

Section 7.1 explores how Hamilton's equations of motion motivate the notion
of symplectic manifolds. Because of these applications and fascinating properties,
symplectic geometry has become a significant field. Historically, it was the
Hamiltonian formulation of dynamics that lent itself best to quantization and hence to
Schrödinger's equation in quantum mechanics.

In special relativity, Einstein’s perspective of viewing spacetime as a single unit, 
equipped with a modified notion of metric, is properly modeled by Minkowski 
spaces. Section 7.2 discusses this, along with its natural generalization to pseudo- 
Riemannian manifolds. 

A few exercises in this text have dealt with the theory of electromagnetism. In 
Section 7.3, we gather together some of the results we have seen in the theory of 
electromagnetism and rephrase them into the formalism of a Lorentzian spacetime. 

We also discuss a few geometric concepts underlying string theory. Between 
1900 and 1940, physics took two large steps in opposite directions of the size scale, 
with quantum mechanics describing the dynamics of the very small scale and gen- 
eral relativity describing the very large scale. These theories involve very different 
types of mathematics, which led physicists to look for reformulations or generaliza- 
tions that could subsume both theories. However, despite extensive work to find 
a unifying theory, the task has proven exceedingly difficult, even on mathematical 
grounds. String theory is a model for the structure of elementary particles that
currently holds promise to provide such a unification. We wish to mention string
theory in this book because, at its core, the relativistic dynamics of a string involve 
a two-dimensional submanifold of a Minkowski space. 

Finally, Einstein’s theory of general relativity, introduced in Section 7.5, stands 
as a direct application of Riemannian manifolds. In fact, general relativity mo- 
tivated some of the development and helped proliferate the notions of Rieman- 
nian (and pseudo-Riemannian) geometry beyond the confines of pure mathematics. 
Many of the “strange” (non-Newtonian) phenomena that fill the pages of popular 
books on cosmology occur as consequences of the mathematics of this geometry. 

This chapter assumes that the reader has some experience in physics but no more 
than a first college course (calculus-based) in mechanics. All the other material will 
be introduced as needed. We do not discuss issues of quantization as those exceed 
the scope of this book. 


7.1 Hamiltonian Mechanics 
7.1.1 Equations of Motion 


The classical study of dynamics relies almost exclusively on Newton's laws of motion,
in particular, his second law. This law states that the sum of the external forces on a
particle or object is equal to the rate of change of momentum, i.e.,
$$\sum \vec{F}_{\mathrm{ext}} = \frac{d\vec{p}}{dt}, \tag{7.1}$$
where $\vec{p} = m\, d\vec{x}/dt$ and $\vec{x}(t)$ is the position of the particle at time $t$. If $m$ is constant,
(7.1) reduces to
$$\sum \vec{F}_{\mathrm{ext}} = m \frac{d^2\vec{x}}{dt^2}. \tag{7.2}$$
Furthermore, by a simple calculation, (7.1) directly implies the following law of
motion for angular momentum about an origin $O$:
$$\frac{d\vec{L}}{dt} = \sum \vec{\tau}_{\mathrm{ext}}, \tag{7.3}$$
where $\vec{L} = \vec{r} \times \vec{p}$ is the angular momentum of a particle or solid, where $\vec{r}$ is the
position vector of the particle or center of mass of the solid, and where $\sum \vec{\tau}_{\mathrm{ext}}$ is
the sum of the torques about $O$. (Recall that the torque about the origin of a force
$\vec{F}$ is $\vec{\tau} = \vec{r} \times \vec{F}$.)

Though (7.1) undergirds all of classical dynamics, the value of ancillary equa- 
tions, such as (7.3), arises from the fact that these other equations may elucidate 
conserved quantities or produce more tractable equations when using different vari- 
ables besides the Cartesian coordinates. For example, when describing the orbits 
of planets around the sun, polar (cylindrical) coordinates are far better suited than
Cartesian coordinates. In particular, as shown in Example 2.2.3, the angular
momentum is a conserved quantity for a particle under the influence of forces that are
radial about some origin.

It turns out that in many cases (in particular when the forces are conservative),
either (7.2) or (7.3) follows from a specific variational principle that has extensive
consequences. Suppose that the state of a physical system is described by a system
of coordinates $q_k$, with $k = 1,2,\ldots,n$. Hamilton's principle states that the motion
of a system evolves according to a path $P$ parametrized by $(q_1(t),\ldots,q_n(t))$ between
times $t_1$ and $t_2$ so as to minimize the integral
$$S = \int_P L \, dt = \int_{t_1}^{t_2} L \, dt, \tag{7.4}$$
where $L$ is the Lagrangian function. The integral $S$ is called the action of the
system. When the system is under the influence of only conservative forces, the
Lagrangian is $L = T - V$, where $T$ is the kinetic energy and $V$ is the potential
energy. Recall that for a conservative force $\vec{F}$, its potential energy $V$, which is a
function of the position variables alone, satisfies
$$\vec{F} = -\nabla V = -\operatorname{grad} V.$$


Intuitively speaking, in the case of conservative forces, Hamilton’s principle states 
that a system evolves in such a way as to minimize the total variation between 
kinetic and potential energy. However, even if a force is not conservative, it may 
still possess an associated Lagrangian that produces the appropriate equation of 
motion. (See Problem 7.1.8 for such an example.) 

We consider the Lagrangian $L$ as an explicit function of $t$, the coordinates $q_k$, and
their time derivatives $\dot{q}_k = dq_k/dt$. According to Theorem B.3.1 in Appendix B,
the Lagrangian must satisfy the Euler-Lagrange equation in each coordinate $q_k$,
namely,
$$\frac{\partial L}{\partial q_k} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_k}\right) = 0. \tag{7.5}$$
These are called Lagrange's equations of motion. Though this system of equations
moves away from the nice vector expression of (7.2), it has the distinct advantage
of expressing equations of motion in a consistent way for any choice of coordinates.
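To see how (7.5) relates to Newton's second law, the following short computation may help. It is a sketch under stated assumptions (a single particle of constant mass $m$ in Cartesian coordinates $x^1, x^2, x^3$, subject to a conservative force with potential $V$), not a derivation taken from the text.

```latex
% Assumed Lagrangian: L = T - V in Cartesian coordinates
L = \tfrac{1}{2} m \sum_{i=1}^{3} (\dot{x}^i)^2 - V(x^1, x^2, x^3),
\qquad
\frac{\partial L}{\partial x^i} = -\frac{\partial V}{\partial x^i},
\qquad
\frac{\partial L}{\partial \dot{x}^i} = m \dot{x}^i .

% Substituting into the Euler-Lagrange equation (7.5):
-\frac{\partial V}{\partial x^i} - \frac{d}{dt}\bigl(m \dot{x}^i\bigr) = 0
\quad\Longleftrightarrow\quad
m\, \frac{d^2 \vec{x}}{dt^2} = -\nabla V = \vec{F} .
```

Under these assumptions the Euler-Lagrange equations reproduce (7.2) component by component.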
Example 7.1.1. Consider a ball (or cylinder) of radius $R$ rolling down a plane
inclined with angle $\alpha$, as depicted in Figure 7.1. Because the object rolls instead of
sliding, the rotation about its center leads to an additional kinetic energy amount
of $\frac{1}{2} I \dot{\theta}^2$, where $I$ is the moment of inertia of the object about its axis of rotation
and $\dot{\theta}$ is the rate of rotation about its center. However, because there is
no slipping, we deduce that $\dot{x} = R\dot{\theta}$, where $x$ is the coordinate of the distance of the
center of mass of the object up the incline. Thus, the Lagrangian of this system is
$$L = \frac{1}{2} I \dot{\theta}^2 + \frac{1}{2} m v^2 - mgh,$$
that is,
$$L(x,\dot{x}) = \frac{I}{2R^2}\dot{x}^2 + \frac{1}{2} m \dot{x}^2 - mgx\sin\alpha.$$

Figure 7.1: A round object rolling downhill.

The Euler-Lagrange Equation (7.5) gives
$$\frac{\partial L}{\partial x} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right) = -mg\sin\alpha - \left(\frac{I}{R^2} + m\right)\ddot{x} = 0,$$
which leads to the equation of motion
$$\frac{d^2x}{dt^2} = -\frac{g\sin\alpha}{\frac{I}{mR^2} + 1},$$
a well-known result from classical mechanics.
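The equation of motion of Example 7.1.1 is easy to evaluate for specific shapes. The sketch below is an illustration added here: the function name, the sample values of $m$, $R$, and $\alpha$, and the choice of shapes (with their standard moments of inertia about the symmetry axis) are assumptions, not data from the text.

```python
import math

def incline_acceleration(inertia, mass, radius, alpha, g=9.81):
    """Return d^2x/dt^2 = -g sin(alpha) / (I/(m R^2) + 1), with x measured up the incline."""
    return -g * math.sin(alpha) / (inertia / (mass * radius**2) + 1.0)

m, R, alpha = 1.0, 0.1, math.radians(30.0)   # illustrative values

# Standard moments of inertia about the axis through the center of mass.
shapes = {
    "no rotation (I = 0)": 0.0,
    "solid sphere": 0.4 * m * R**2,
    "solid cylinder": 0.5 * m * R**2,
    "thin hoop": 1.0 * m * R**2,
}

for name, inertia in shapes.items():
    a = incline_acceleration(inertia, m, R, alpha)
    print(f"{name:22s} acceleration = {a:.4f} m/s^2")
```

The larger the ratio $I/(mR^2)$, the more slowly the object accelerates down the incline; the case $I = 0$ recovers the frictionless sliding value $-g\sin\alpha$.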


Though Example 7.1.1 involves a variable $x$ that is essentially taken from $\mathbb{R}$,
physical systems in general may typically be described by other types of variables.
When studying the motion of a simple pendulum (Figure 7.2), we use as a variable
the angle $\theta$ of deviation of the pendulum from the vertical. A system that is a
double pendulum (see Figure 7.3) involves two angles.

If a physical system can be described by using $n$ locally independent variables,
then we say the system has $n$ degrees of freedom. The set of all possible states of a
physical system is a real manifold $Q$ of dimension $n$, called the configuration space
of the system. The variables $(q_k)$ that locate a point on (a coordinate chart of) the
manifold $Q$ are called the position variables. (Note, we will use the subscript indices
for the position variables to conform to physics texts and literature on symplectic
manifolds, though one should remember at this stage that they are contravariant
quantities.) For example, the configuration space of the system in Example 7.1.1
is simply $Q = \mathbb{R}$, while the configuration space for the simple pendulum is $Q = S^1$
and the configuration space of the double pendulum is the torus $Q = S^1 \times S^1$.

The time development of a system corresponds to a curve $\gamma : t \mapsto (q_k(t))$ on the
manifold, and the functions $\dot{q}_1, \ldots, \dot{q}_n$ are the coordinates of a tangent vector along
$\gamma$ in the tangent space $TQ$.

Now the Euler-Lagrange Equation (7.5) is a system of second-order, ordinary,
differential equations. We would like to change this into a system of first-order
differential equations for two reasons: (1) many theorems on differential equations
are stated for systems of first-order equations and (2) it is easier to discuss first-order
equations in the context of manifolds. We do this in the following way.

Figure 7.2: Simple Pendulum. Figure 7.3: Compound Pendulum.

Define the generalized momenta functionally by
$$p_k = \frac{\partial L}{\partial \dot{q}_k}. \tag{7.6}$$
The quantities $p_k$ are the components of the momentum vector, which is in fact an
element of $T_{\gamma(t)}Q^*$. We can see this as follows. Let $W$ be an $n$-dimensional vector
space, and let $f : W \to \mathbb{R}$ be any differentiable function. Then the differential $df_{\vec{v}}$
at a point $\vec{v} \in W$ is a linear transformation $df_{\vec{v}} : W \to \mathbb{R}$. Thus, by definition of
the dual space, $df_{\vec{v}} \in W^*$. Consequently, the differential $df$ gives a correspondence
$df : W \to W^*$ via
$$\vec{v} \longmapsto df_{\vec{v}} = \sum_{i=1}^{n} \frac{\partial f}{\partial x^i}\bigg|_{\vec{v}} \, dx^i.$$
Taking $W$ as the vector space $T_{\gamma(t)}Q$, the momentum at the point $\gamma(t)$ is the vector
$dL_{(\dot{q}_k)} \in T_{\gamma(t)}Q^*$. Hence, we can think of the momentum vector $p$ as a covector
field along the curve $\gamma$ given at each point by $dL_{(\dot{q}_k)}$.
Consider now the Hamiltonian function $H$ defined by
$$H = \sum_{k=1}^{n} p_k \dot{q}_k - L(q_1, \ldots, q_n, \dot{q}_1, \ldots, \dot{q}_n, t).$$
Since we can write the quantities $\dot{q}_k$ in terms of the components $p_k$, we can view the
Hamiltonian as a time-dependent function on $TQ^*$. Given any configuration space
$Q$, we define the cotangent bundle $TQ^*$ as the phase space of the system. If $Q$ is
an $n$-dimensional manifold, then $TQ^*$ is a manifold of dimension $2n$.

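The variables $\dot{q}_k$ are now functions of the independent variables
$$(t, q_1, \ldots, q_n, p_1, \ldots, p_n).$$
Taking derivatives of $H$, we find that
$$\begin{aligned}
\frac{\partial H}{\partial p_i} &= \dot{q}_i + \sum_{k=1}^{n} p_k \frac{\partial \dot{q}_k}{\partial p_i} - \sum_{k=1}^{n} \frac{\partial L}{\partial \dot{q}_k}\frac{\partial \dot{q}_k}{\partial p_i} \\
&= \dot{q}_i + \sum_{k=1}^{n} \left(p_k - \frac{\partial L}{\partial \dot{q}_k}\right)\frac{\partial \dot{q}_k}{\partial p_i} = \dot{q}_i,
\end{aligned}$$
where each term of the summation is 0 by definition of $p_k$. Furthermore, note that
Lagrange's equation reduces to $\partial L/\partial q_i = \dot{p}_i$. Thus, taking derivatives with respect
to $q_i$, we get
$$\begin{aligned}
\frac{\partial H}{\partial q_i} &= \sum_{k=1}^{n} p_k \frac{\partial \dot{q}_k}{\partial q_i} - \left(\frac{\partial L}{\partial q_i} + \sum_{k=1}^{n} \frac{\partial L}{\partial \dot{q}_k}\frac{\partial \dot{q}_k}{\partial q_i}\right) \\
&= -\frac{\partial L}{\partial q_i} + \sum_{k=1}^{n} \left(p_k - \frac{\partial L}{\partial \dot{q}_k}\right)\frac{\partial \dot{q}_k}{\partial q_i} = -\dot{p}_i.
\end{aligned}$$
Therefore, given the definition in Equation (7.6), the Euler-Lagrange Equation (7.5)
is equivalent to
$$\dot{q}_k = \frac{\partial H}{\partial p_k}, \qquad \dot{p}_k = -\frac{\partial H}{\partial q_k}. \tag{7.7}$$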


This system of equations is called Hamilton's equations of motion. They consist
of $2n$ first-order ordinary differential equations in the $2n$ unknown functions $q_k(t)$
and $p_k(t)$, whereas Lagrange's equations of motion consist of $n$ second-order
ordinary differential equations in the $n$ unknown functions $q_k(t)$.
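A one-dimensional example makes the passage from Lagrange's to Hamilton's equations concrete. The system below (a mass $m$ on a spring with stiffness $\kappa$) is an assumed illustration, not an example from the text.

```latex
% Assumed Lagrangian of a harmonic oscillator
L(x, \dot{x}) = \tfrac{1}{2} m \dot{x}^2 - \tfrac{1}{2} \kappa x^2,
\qquad
p = \frac{\partial L}{\partial \dot{x}} = m \dot{x}
\;\Longrightarrow\; \dot{x} = \frac{p}{m} .

% Hamiltonian H = p \dot{x} - L, written in the variables (x, p)
H(x, p) = \frac{p^2}{m} - \left( \frac{p^2}{2m} - \tfrac{1}{2} \kappa x^2 \right)
        = \frac{p^2}{2m} + \tfrac{1}{2} \kappa x^2 .

% Hamilton's equations: two first-order equations
\dot{x} = \frac{\partial H}{\partial p} = \frac{p}{m},
\qquad
\dot{p} = -\frac{\partial H}{\partial x} = -\kappa x .
```

Eliminating $p$ gives $m\ddot{x} = -\kappa x$, which is the single second-order Euler-Lagrange equation for the same Lagrangian.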

For simple dynamic systems, the kinetic energy $T$ is a homogeneous quadratic
function in the variables $\dot{q}_k$. If this is the case, then it is not hard to show that
$$\sum_{k=1}^{n} \dot{q}_k p_k = 2T, \tag{7.8}$$
where $T$ is the kinetic energy. If, in addition, the forces acting on the system are
conservative, then
$$H = 2T - (T - V) = T + V,$$
which is the total energy of the system.
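One way to see (7.8) is the following short computation. It is a sketch under the assumptions that $T$ has the form below with symmetric coefficients $a_{jk} = a_{kj}$ (which may depend on the positions $q$) and that the potential $V$ does not depend on the velocities; the symbol $a_{jk}$ is introduced here only for illustration.

```latex
T = \tfrac{1}{2} \sum_{j,k=1}^{n} a_{jk}(q)\, \dot{q}_j \dot{q}_k ,
\qquad
p_k = \frac{\partial L}{\partial \dot{q}_k} = \frac{\partial T}{\partial \dot{q}_k}
    = \sum_{j=1}^{n} a_{kj}(q)\, \dot{q}_j ,

\sum_{k=1}^{n} \dot{q}_k p_k
  = \sum_{j,k=1}^{n} a_{kj}(q)\, \dot{q}_k \dot{q}_j
  = 2T .
```

This is Euler's theorem on homogeneous functions applied to a function of degree 2 in the velocities.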


Example 7.1.2 (The Spherical Pendulum). As a longer example that compares the
Lagrange and the Hamilton equations of motion, consider the spherical pendulum
as shown in Figure 7.4. This classical problem consists of a point mass that is
hanging from a string and is free to move not just in a vertical plane but in both its
natural degrees of freedom. We label the mass of the object at the end of the string
as $m$ and the length of the string as $l$. For simplicity, we assume that the mass
of the string is negligible and that there is no friction where the string attaches at
a fixed point. This scenario is called the spherical pendulum problem because the
same equations govern the motion of an object moving in a spherical bowl under
the action of gravity and with no (negligible) friction.

Figure 7.4: Spherical pendulum.

We use a Cartesian frame of reference, in which the origin is the fixed point to
which the string is attached and the $z$-axis lines up with the vertical axis that the
string makes when at rest and hanging straight down. Furthermore, we orient the
$z$-axis downward. With this setup, the degrees of freedom are the usual angles $\theta$
and $\varphi$ from spherical coordinates. To obtain the Lagrange equations of motion, we
need to first identify the kinetic energy $T$ and potential energy $V$.

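The velocity vector for the particle moving at the end of the string is
$$\vec{v} = l(\dot{\varphi}\cos\varphi\cos\theta - \dot{\theta}\sin\varphi\sin\theta,\ \dot{\varphi}\cos\varphi\sin\theta + \dot{\theta}\sin\varphi\cos\theta,\ -\dot{\varphi}\sin\varphi),$$
so after simplifications, the kinetic energy is
$$T = \frac{1}{2} m l^2 (\dot{\varphi}^2 + \dot{\theta}^2 \sin^2\varphi).$$
The potential energy is $V = mgl(1 - \cos\varphi)$, so the Lagrangian is
$$L = \frac{1}{2} m l^2 (\dot{\varphi}^2 + \dot{\theta}^2 \sin^2\varphi) - mgl(1 - \cos\varphi).$$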
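The Lagrange equations of motion are
$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}}\right) - \frac{\partial L}{\partial \theta} = 0 \quad\text{and}\quad \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\varphi}}\right) - \frac{\partial L}{\partial \varphi} = 0,$$
which in this specific example give the system of differential equations
$$\frac{d}{dt}\left(m l^2 \dot{\theta}\sin^2\varphi\right) = 0, \qquad \frac{d}{dt}\left(m l^2 \dot{\varphi}\right) = m l^2 \dot{\theta}^2 \sin\varphi\cos\varphi - mgl\sin\varphi.$$

As we try to extract a more convenient set of equations that govern this system,
we could take the derivative on the left-hand side of the first equation. However,
it is more useful to see that $p_\theta = \partial L/\partial\dot{\theta} = m l^2 \dot{\theta}\sin^2\varphi$ is a constant. This is the
$\theta$-momentum. Then we can write the second equation as
$$\ddot{\varphi} = \sin\varphi\left(\dot{\theta}^2\cos\varphi - \frac{g}{l}\right).$$
Since $p_\theta$ is a constant, we can solve for $\dot{\theta}$ in terms of $p_\theta$ and write the second equation
only in terms of $\varphi$ to get the following system of equations:
$$p_\theta = m l^2 \dot{\theta}\sin^2\varphi, \qquad \ddot{\varphi} = \frac{p_\theta^2 \cos\varphi}{m^2 l^4 \sin^3\varphi} - \frac{g}{l}\sin\varphi. \tag{7.9}$$
These are still essentially the Lagrange equations of motion, with the understanding
that $p_\theta$ is constant.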

It might appear that the $\sin^3\varphi$ in the denominator in the second equation in
(7.9) could be a cause for concern at $\varphi = 0$ but it is not, as we now explain. Recall
that $p_\theta$ is constant. If $p_\theta = 0$, then the second equation in (7.9) does not possess a
singularity at $\varphi = 0$. On the other hand, if $p_\theta \neq 0$, then, since
$p_\theta = m l^2 \dot{\theta}\sin^2\varphi$, we have $\sin\varphi \neq 0$, so $\varphi$ is never 0.

In order to solve the equations of motion, we first solve the equation that involves
only $\varphi(t)$. Once we know $\varphi(t)$, we find $\theta(t)$ by integrating
$$\dot{\theta} = \frac{p_\theta}{m l^2 \sin^2(\varphi(t))},$$
using the fact that $p_\theta$ is a constant.

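To establish Hamilton's equations of motion, we first find the generalized
momenta of the coordinates as $p_\theta = m l^2 \dot{\theta}\sin^2\varphi$ and $p_\varphi = m l^2 \dot{\varphi}$. To get the
Hamiltonian of this system, we first point out that the momenta give us values for $\dot{\theta}$ and $\dot{\varphi}$.
Hence
$$\begin{aligned}
H &= p_\theta \dot{\theta} + p_\varphi \dot{\varphi} - L \\
&= \frac{p_\theta^2}{m l^2 \sin^2\varphi} + \frac{p_\varphi^2}{m l^2} - \frac{1}{2} m l^2\left[\left(\frac{p_\varphi}{m l^2}\right)^2 + \left(\frac{p_\theta}{m l^2 \sin^2\varphi}\right)^2 \sin^2\varphi\right] + mgl(1 - \cos\varphi) \\
&= \frac{p_\theta^2}{2 m l^2 \sin^2\varphi} + \frac{p_\varphi^2}{2 m l^2} + mgl(1 - \cos\varphi).
\end{aligned}$$
In this example, we see that the Hamiltonian is indeed the total energy $T + V$.
Then Hamilton's equations (7.7) for this system are
$$\begin{aligned}
\dot{p}_\theta &= 0, \\
\dot{p}_\varphi &= -mgl\sin\varphi + \frac{p_\theta^2 \cos\varphi}{m l^2 \sin^3\varphi}, \\
\dot{\theta} &= \frac{p_\theta}{m l^2 \sin^2\varphi}, \\
\dot{\varphi} &= \frac{p_\varphi}{m l^2}.
\end{aligned}$$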
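Hamilton's equations for the spherical pendulum are first-order, so they can be handed directly to a standard numerical integrator. The sketch below is an illustration added here and not the author's method: it assumes a classical fourth-order Runge-Kutta step, unit mass and length, and an arbitrary initial state with $p_\theta \neq 0$; all names are hypothetical. It checks that $p_\theta$ and the Hamiltonian stay (nearly) constant along the computed trajectory.

```python
import math

def hamiltonian(state, m=1.0, l=1.0, g=9.81):
    """H = p_theta^2/(2 m l^2 sin^2 phi) + p_phi^2/(2 m l^2) + m g l (1 - cos phi)."""
    theta, phi, p_theta, p_phi = state
    return (p_theta**2 / (2 * m * l**2 * math.sin(phi)**2)
            + p_phi**2 / (2 * m * l**2)
            + m * g * l * (1 - math.cos(phi)))

def rhs(state, m=1.0, l=1.0, g=9.81):
    """Right-hand side of Hamilton's equations for (theta, phi, p_theta, p_phi)."""
    theta, phi, p_theta, p_phi = state
    s, c = math.sin(phi), math.cos(phi)
    theta_dot = p_theta / (m * l**2 * s**2)
    phi_dot = p_phi / (m * l**2)
    p_theta_dot = 0.0                       # theta is a cyclic coordinate
    p_phi_dot = -m * g * l * s + p_theta**2 * c / (m * l**2 * s**3)
    return (theta_dot, phi_dot, p_theta_dot, p_phi_dot)

def rk4_step(state, dt):
    """One classical Runge-Kutta step (an assumed, generic integrator)."""
    def shifted(base, slope, scale):
        return tuple(x + scale * k for x, k in zip(base, slope))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, dt / 2))
    k3 = rhs(shifted(state, k2, dt / 2))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(state, dt=1e-3, steps=5000):
    """Integrate for steps * dt units of time and return the final state."""
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state

start = (0.0, 1.0, 0.5, 0.0)                # theta, phi, p_theta, p_phi (illustrative)
end = simulate(start)
print("p_theta at start and end:", start[2], end[2])
print("energy drift:", hamiltonian(end) - hamiltonian(start))
```

Because $p_\theta \neq 0$ in this initial state, $\sin\varphi$ stays away from 0 along the trajectory, in line with the discussion following (7.9).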


7.1.2 Symplectic Manifolds 


We now introduce the notion of a symplectic manifold and show how Hamilton’s 
equations of motion arise naturally in this context. The theory of symplectic geome- 
try is a branch of geometry in and of itself so we do not pretend to cover it extensively 
here. Instead, we refer the reader to [8] or [1] for a more thorough introduction. In 
this section, we simply illustrate how the theory of manifolds, equipped with some 
additional structure, is ideally suited for this area of mathematical physics. 


Definition 7.1.3. Let $V$ be a vector space over a field $K$. A symplectic form is a
bilinear form
$$\omega : V \times V \to K$$
that is:

1. antisymmetric: $\omega(v,v) = 0$ for all $v \in V$;

2. nondegenerate: if $\omega(v,w) = 0$ for all $w \in V$, then $v = 0$.
The pair $(V,\omega)$ is called a symplectic vector space.


Proposition 7.1.4. Let $(V,\omega)$ be a finite-dimensional, symplectic vector space.
There exists a basis $\mathcal{B}$ of $V$ relative to which the matrix of $\omega$ is
$$[\omega]_{\mathcal{B}} = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix},$$
where $I_n$ is the $n \times n$ identity matrix. In addition, $V$ has even dimension.


Proof. (Left as an exercise for the reader. See Problem 7.1.4.)


Since the form $\omega$ is antisymmetric and bilinear, $\omega \in \bigwedge^2 V^*$. Suppose that
$V$ has a basis $\mathcal{B} = \{e_1, \ldots, e_{2n}\}$, and let $\mathcal{B}^* = \{e_1^*, \ldots, e_{2n}^*\}$ be the associated dual
basis (see Section 4.1). Then in coordinates, we can write $\omega$ as
$$\omega = \sum_{1 \leq i < j \leq 2n} \omega_{ij} \, e_i^* \wedge e_j^*.$$
However, a nice corollary follows immediately from Proposition 7.1.4.

Corollary 7.1.5. Let $(V,\omega)$ be a symplectic vector space of dimension $2n$. Then
there exists a basis $\mathcal{B} = \{e_1, \ldots, e_{2n}\}$ such that $\omega$ can be written as
$$\omega = \sum_{i=1}^{n} e_i^* \wedge e_{n+i}^*.$$


The expression in Corollary 7.1.5 is called the canonical form of the symplectic
form $\omega$.
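The canonical form is easy to experiment with on $\mathbb{R}^{2n}$. The following sketch is an illustration added here (function names are hypothetical): it builds the matrix of Proposition 7.1.4, evaluates $\omega(v,w) = v^{T}[\omega]w$, and exhibits, for a given nonzero $v$, a vector $w$ with $\omega(v,w) \neq 0$, which is what nondegeneracy requires.

```python
def canonical_matrix(n):
    """Matrix of the canonical symplectic form on R^(2n): blocks [[0, I_n], [-I_n, 0]]."""
    J = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        J[i][n + i] = 1
        J[n + i][i] = -1
    return J

def omega(v, w, J):
    """Evaluate the bilinear form omega(v, w) = v^T J w."""
    return sum(v[a] * J[a][b] * w[b] for a in range(len(v)) for b in range(len(w)))

def partner(v, J):
    """Return w = J^T v; since J J^T = I, omega(v, w) equals the squared length of v."""
    return [sum(J[a][b] * v[a] for a in range(len(v))) for b in range(len(v))]

J = canonical_matrix(2)
v = [1, 2, 3, 4]
w = [5, 6, 7, 8]

print("omega(v, w) =", omega(v, w, J))
print("omega(w, v) =", omega(w, v, J))                        # opposite sign
print("omega(v, v) =", omega(v, v, J))                        # always 0
print("omega(v, partner(v)) =", omega(v, partner(v, J), J))   # nonzero when v != 0
```

The last line gives a constructive reason why the canonical form is nondegenerate: the only vector $v$ with $\omega(v, w) = 0$ for every $w$ is $v = 0$.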


Definition 7.1.6. A symplectic manifold $(M,\omega)$ is a smooth manifold $M$ equipped
with a 2-form $\omega$ that is closed ($d\omega = 0$) and nondegenerate. In other words, $M$ is a
smooth manifold such that for each $p \in M$, $T_pM$ is a symplectic vector space with
symplectic form $\omega_p$, and $\omega_p$ varies smoothly with $p$.


By Proposition 7.1.4, one sees that a symplectic manifold has even dimension.


Definition 7.1.7. If $(M,\omega)$ and $(\tilde{M},\tilde{\omega})$ are two symplectic manifolds, then a
smooth map $F : M \to \tilde{M}$ is called symplectic if
$$F^*\tilde{\omega} = \omega.$$
We say that $F$ preserves the symplectic structure. If, in addition, $F^{-1}$ is also a
smooth symplectic map, then $F$ is called a symplectomorphism.


Darboux's Theorem, a fundamental result in the theory of symplectic manifolds,
establishes that given any two symplectic forms $\omega$ and $\tilde{\omega}$ on $M$ such that $\omega_P = \tilde{\omega}_P$ at
some point $P \in M$, there exists a neighborhood $U$ of $P$ and a diffeomorphism
$F : U \to F(U) \subset M$ such that $F(P) = P$ and $F^*\tilde{\omega} = \omega$. (We refer the reader
to [8, Section 2.2] for a proof.) Darboux's Theorem is equivalent to the following
formulation.


Theorem 7.1.8. Let $(M,\omega)$ be a symplectic manifold. For each point $P \in M$,
there exists an open neighborhood $U$ of $P$ and a symplectomorphism $F$ of $U$ onto
$F(U) \subset \mathbb{R}^{2n}$ such that $(F^{-1})^*\omega$ takes the canonical form in $\mathbb{R}^{2n}$.


As a consequence of this theorem, at every point $P \in M$, there exists a
coordinate neighborhood $U$ of $P$ with coordinates $x$ in which
$$\omega = \sum_{i=1}^{n} dx_i \wedge dx_{n+i}.$$

The formalism of symplectic manifolds applies to Hamiltonian mechanics in the
following way. Consider the configuration space $Q$ for a physical system. Suppose
that $Q$ is a manifold of dimension $n$. The cotangent bundle $M = TQ^*$ is a
manifold in itself of dimension $2n$. If $U$ is a coordinate neighborhood of $Q$ with
coordinates $(q_1, \ldots, q_n)$, then $\tilde{U} = \pi^{-1}(U)$ is a coordinate neighborhood for the
manifold $TQ^*$, where $\pi : TQ^* \to Q$ is the bundle projection map. The quantities
$(q_1, \ldots, q_n, p_1, \ldots, p_n)$ of position coordinates and corresponding generalized
momenta form a coordinate system on $\tilde{U}$. By the identification of $T\mathbb{R}^n \cong \mathbb{R}^n$, it is
not hard to show that $TM = T(TQ^*) = TQ \oplus TQ^*$.


Proposition 7.1.9. The 2-form defined over a particular coordinate patch $\pi^{-1}(U)$
by
$$\omega = \sum_{i=1}^{n} dq_i \wedge dp_i \tag{7.10}$$
extends to a 2-form $\omega \in \Omega^2(TQ^*)$ over the whole phase space $TQ^*$. Furthermore,
it is defined in exactly the same way as in Equation (7.10) over every coordinate
patch on $TQ^*$ obtained as $\pi^{-1}(\bar{U})$, where $\bar{U}$ is any other coordinate patch of $Q$.
Consequently, the form $\omega$ endows $TQ^*$ with the structure of a symplectic manifold.


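Proof. Let $F : U \cap \bar{U} \to U \cap \bar{U}$ be a coordinate transformation from $(q_i)$ to $(\bar{q}_i)$
coordinates, and let $G : \pi^{-1}(U \cap \bar{U}) \to \pi^{-1}(U \cap \bar{U})$ be the corresponding coordinate
transformation from $(q_i, p_i)$ to $(\bar{q}_i, \bar{p}_i)$ on $TQ^*$. Since $p_i$ are coordinates in the
cotangent space, the differential of $G$ has coordinate functions
$$[dG] = \begin{pmatrix} \dfrac{\partial \bar{q}_i}{\partial q_j} & 0 \\ 0 & \dfrac{\partial q_j}{\partial \bar{q}_i} \end{pmatrix} = \begin{pmatrix} [dF] & 0 \\ 0 & ([dF]^{-1})^{T} \end{pmatrix}.$$
In particular, we deduce that
$$d\bar{q}_i = \sum_{j=1}^{n} \frac{\partial \bar{q}_i}{\partial q_j}\, dq_j \quad\text{and}\quad d\bar{p}_i = \sum_{k=1}^{n} \frac{\partial q_k}{\partial \bar{q}_i}\, dp_k.$$
Thus,
$$\begin{aligned}
\sum_{i=1}^{n} d\bar{q}_i \wedge d\bar{p}_i &= \sum_{i=1}^{n} \left(\sum_{j=1}^{n} \frac{\partial \bar{q}_i}{\partial q_j}\, dq_j\right) \wedge \left(\sum_{k=1}^{n} \frac{\partial q_k}{\partial \bar{q}_i}\, dp_k\right) = \sum_{j=1}^{n}\sum_{k=1}^{n} \left(\sum_{i=1}^{n} \frac{\partial \bar{q}_i}{\partial q_j}\frac{\partial q_k}{\partial \bar{q}_i}\right) dq_j \wedge dp_k \\
&= \sum_{j=1}^{n}\sum_{k=1}^{n} \delta_{jk}\, dq_j \wedge dp_k = \sum_{j=1}^{n} dq_j \wedge dp_j.
\end{aligned}$$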


Figure 7.5: A point mass sliding off a hemisphere.

The Hamiltonian function $H$ is a smooth function $TQ^* \to \mathbb{R}$. We define the
Hamiltonian vector field $X_H$ as the unique vector field that satisfies
$$i_{X_H}\omega = dH, \tag{7.11}$$
where $i_{X_H}$ is the contraction operator, which on forms is equivalent to the
interior product (see Problem 5.4.16). Specifically in this case, $i_{X_H}\omega$ is the 1-form
defined by $i_{X_H}\omega(Y) = \omega(X_H, Y)$ at all $P \in M$ and for all $Y \in \mathfrak{X}(M)$. It is not too
hard to show that in coordinates of $T(TQ^*)$, the vector field $X_H$ is
$$X_H = \sum_{i=1}^{n} \left(\frac{\partial H}{\partial p_i}\frac{\partial}{\partial q_i} - \frac{\partial H}{\partial q_i}\frac{\partial}{\partial p_i}\right). \tag{7.12}$$
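The coordinate expression (7.12) can be obtained by a direct computation. The following is a sketch added for illustration: the coefficient functions $a_i$, $b_i$ are introduced here, and the convention $(\alpha \wedge \beta)(X,Y) = \alpha(X)\beta(Y) - \alpha(Y)\beta(X)$ for 1-forms is assumed.

```latex
% Write the unknown vector field with coefficient functions a_i, b_i:
X_H = \sum_{i=1}^{n} \left( a_i \frac{\partial}{\partial q_i} + b_i \frac{\partial}{\partial p_i} \right).

% Contract with omega = \sum_i dq_i \wedge dp_i from (7.10):
i_{X_H}\omega = \sum_{i=1}^{n} \bigl( dq_i(X_H)\, dp_i - dp_i(X_H)\, dq_i \bigr)
              = \sum_{i=1}^{n} \bigl( a_i\, dp_i - b_i\, dq_i \bigr).

% Compare with dH and match coefficients as required by (7.11):
dH = \sum_{i=1}^{n} \left( \frac{\partial H}{\partial q_i}\, dq_i + \frac{\partial H}{\partial p_i}\, dp_i \right)
\quad\Longrightarrow\quad
a_i = \frac{\partial H}{\partial p_i}, \qquad b_i = -\frac{\partial H}{\partial q_i}.
```

The coefficients $a_i$ and $b_i$ are exactly the right-hand sides of Hamilton's equations (7.7).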


Proposition 7.1.10. A curve $\gamma$ on the phase space $TQ^*$ is an integral curve of
the vector field $X_H$ if and only if in each coordinate system the components $\gamma(t) =
(q_k(t), p_k(t))$ satisfy Hamilton's equations of motion from Equation (7.7).


Proof. As a vector field on the curve $\gamma$, the derivative $\dot{\gamma}(t)$ is written in coordinates
as
$$\dot{\gamma}(t) = \sum_{i=1}^{n} \dot{q}_i \frac{\partial}{\partial q_i} + \sum_{i=1}^{n} \dot{p}_i \frac{\partial}{\partial p_i}. \tag{7.13}$$
By Equation (7.12), the Hamiltonian vector field $X_H$ at points along the curve is
expressed in coordinates as
$$(X_H)_{\gamma(t)} = \sum_{i=1}^{n} \frac{\partial H}{\partial p_i}\bigg|_{\gamma(t)} \frac{\partial}{\partial q_i} - \sum_{i=1}^{n} \frac{\partial H}{\partial q_i}\bigg|_{\gamma(t)} \frac{\partial}{\partial p_i}. \tag{7.14}$$
The proposition follows by identification of Equations (7.13) and (7.14).


In other words, Proposition 7.1.10 states that a solution to Hamilton's equations
of motion corresponds to a curve $\gamma(t)$ in the phase space $TQ^*$ such that
$$\dot{\gamma}(t) = (X_H)_{\gamma(t)}.$$
Because of the importance of this formulation, it has its own terminology. If $(M, \omega)$
is a symplectic manifold and $H \in C^\infty(M)$, then with $X_H$ defined by Equation
(7.11), the triple $(M, \omega, X_H)$ is called a Hamiltonian system.


PROBLEMS 


7.1.1. The special orthogonal group in R?, denoted SO(3), consists of all 3 x 3 matrices 
that are orthogonal and have a determinant of 1. Explain why the configuration 
space of the position and orientation of a general solid in Euclidean three-space 
is Q = R® x SO(3). Explain why SO(3) is diffeomorphic to RP*. 
7.1.2. Determine the Lagrangian, Lagrange’s equations of motion, and Hamilton’s equations
of motion for a point mass $m$ sliding off a hemisphere of radius $R$. (See
Figure 7.5.)

7.1.3. Determine the Lagrangian, Lagrange’s equations of motion, and Hamilton’s equations
of motion for an elastic pendulum: a particle of mass $m$ attached to a
(massless) elastic string of elasticity constant $k$ and unstretched length $\ell$.

7.1.4. Determine the Lagrangian, Lagrange’s equations of motion, and Hamilton’s equations
of motion for the coupled harmonic oscillators depicted below:

[Figure: two masses $m_1$ and $m_2$ coupled by three springs.]

Use $x_1$ and $x_2$ as the displacements from where the masses labeled $m_1$ and $m_2$ are
in equilibrium. Assume that there is no friction on the ground. For simplicity,
also assume that when the masses are in equilibrium, all three springs are relaxed.

7.1.5. Consider the motion of the earth around the sun. Placing the sun at the origin, use
polar coordinates $(r, \theta)$ to locate the center of the earth with respect to the sun.
The force of gravity of the sun acting on the earth has a potential energy function
of $V(r) = -GM_SM_E/r$, where $G$ is Newton’s universal constant of gravity, $M_S$
is the mass of the sun and $M_E$ is the mass of the earth. Take into account the
fact that the earth rotates on its own axis. Use the additional angle $\psi$ to orient
the earth around its axis. Write down the Hamiltonian function for this system,
taking into account earth’s rotation. Show that, despite the fact that the rotation
of the earth affects the Hamiltonian, the rotation does not affect the motion of
the earth around the sun.

7.1.6. Suppose that $Q$ is the configuration space for a physical system involving a particle
of mass $m$, and suppose that $Q$ is a Riemannian manifold with metric $g = \langle\,,\rangle$.
Then the kinetic energy of a particle traveling along a curve $\gamma(t)$ is

$$T = \frac{1}{2} m \langle \gamma'(t), \gamma'(t) \rangle.$$

(a) Consider the sphere $S^2$ of radius $R$, and use the coordinates $(\theta, \phi)$. Write
down the Lagrangian, the Hamiltonian, and Hamilton’s equations of motion
of a particle of mass $m$ affected by a potential $V = f(\theta, \phi)$.

(b) Let $Q$ be any Riemannian manifold with metric $g$ and with the associated
Levi-Civita connection. Show that if the potential $V$ is constant, then a
solution to Hamilton’s equations of motion defines a geodesic on $Q$.

7.1.7. Friction is a non-conservative force. Suppose that an object of mass $m$ with
motion in one space variable $x(t)$ is affected by conservative forces with a combined
potential energy function $V(x, t)$ and the force of friction of $F = -\gamma \dot{x}$, where $\gamma$ is
a positive constant. Prove that

$$L = e^{\gamma t/m} \left( \frac{1}{2} m \dot{x}^2 - V \right)$$
is such that the Euler-Lagrange equation (7.5) leads to the correct equation of
motion. Calculate the Hamiltonian associated with this Lagrangian, and write
down Hamilton’s equations of motion.

7.1.8. Classical electromagnetism. Consider a charged particle of mass $m$ and charge $e$
under the influence of a static electric field $\vec{E}$ and magnetic field $\vec{B}$. The non-relativistic
theory of electromagnetism [46] states that the force applied to the
particle is

$$\vec{F} = e\left(\vec{E} + \frac{1}{c}\, \vec{v} \times \vec{B}\right),$$

where $\vec{v} = d\vec{x}/dt$ is the velocity vector of the particle and $c$ is the speed of light.
(The presence of $c$ is a mere scaling factor due to the choice of units.) The electric
field is induced from an electric potential $\phi$ so that $\vec{E} = -\nabla\phi$. The magnetic
force, however, is not a conservative force. Show that the Lagrangian

$$L = \frac{1}{2} m v^2 + e\phi + \frac{e}{c}\, \vec{v} \cdot \vec{A}$$

yields Newton’s equation of motion from Equation (7.2), where $\vec{A}$ is the vector
potential satisfying $\vec{B} = \nabla \times \vec{A}$. Show that the Hamiltonian of this system given
in coordinates $(x_i, p_i)$ is

$$H(\vec{x}, \vec{p}) = \frac{1}{2m}(p_1^2 + p_2^2 + p_3^2) - e\phi(\vec{x}) - \frac{e}{mc}(p_1A_1 + p_2A_2 + p_3A_3).$$

7.1.9. Prove Proposition 7.1.4.

7.1.10. Let $V$ be a vector space of dimension $2n$, and let $\omega$ be any bilinear form on $V$.
Show that $\omega$ is nondegenerate if and only if $\omega^n = \omega \wedge \cdots \wedge \omega$ is nonzero.

7.1.11. Let $(V, \omega)$ be a real symplectic vector space. Let $B = \{e_1, \ldots, e_{2n}\}$ be a basis of
$V$ that gives $\omega$ a canonical form.

(a) Show that if a linear transformation $T : V \to V$ leaves the form invariant,
i.e.,
$$\omega(T(\vec{v}), T(\vec{w})) = \omega(\vec{v}, \vec{w}) \quad \text{for all } \vec{v}, \vec{w} \in V,$$
then the matrix $A$ of $T$ with respect to the basis $B$ satisfies
$$A^\top J A = J, \quad \text{where } J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}.$$

(b) Suppose that $T$ leaves $\omega$ invariant. Show that if $\lambda$ is an eigenvalue of $T$ with
multiplicity $k$, then $1/\lambda$, $\bar{\lambda}$, and $1/\bar{\lambda}$ are also eigenvalues of $T$ with multiplicity
$k$.

7.1.12. An alternative way to define the Hamiltonian vector field $X_H$ involves using the
process of raising indices as defined in Equation (6.11) in Section 6.1. Show that
$X_H = dH^\sharp$, relative to the canonical form $\omega$ on $TQ^*$.

7.1.13. Prove Equation (7.12). [Hint: Use the embedding of $\bigwedge^2 TQ^*$ in $TQ^* \otimes TQ^*$ given
by $dq_i \wedge dp_i = dq_i \otimes dp_i - dp_i \otimes dq_i$.]
7.1.14. Let $Q$ be a configuration space and let $M$ be the associated phase space $M = TQ^*$.
Let $\pi : TQ^* \to Q$ be the canonical projection. Define the Liouville form
$\theta \in \Omega^1(M)$ by
$$\theta_m(X) \overset{\text{def}}{=} \lambda_q(d\pi_m(X))$$
for any point $m = (q, \lambda_q)$ of the phase space $M$ and for any vector $X \in T_mM$.

(a) Using the standard coordinates on $\pi^{-1}(U)$ in $TQ^*$, where $U$ is a coordinate
patch of $Q$, show that the Liouville form has the expression

$$\theta = \sum_{i=1}^n p_i\, dq_i.$$

(b) Conclude that the canonical symplectic form on $TQ^*$ satisfies $\omega = -d\theta$.


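7.1.15. Poisson Bracket. Consider the phase space $M = TQ^*$ for a configuration space
$Q$. Define the Poisson bracket $\{\,,\}$ on the function space $C^\infty(TQ^*)$ by

$$\{f, g\} = \sum_{i=1}^n \left( \frac{\partial f}{\partial q_i} \frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i} \frac{\partial g}{\partial q_i} \right).$$

(a) Show that $\{\,,\}$ is a derivation in each entry, i.e., $\{f_1 f_2, g\} = \{f_1, g\} f_2 +
f_1 \{f_2, g\}$ and similarly for the second entry.

(b) Prove that $\{\,,\}$ gives $C^\infty(TQ^*)$ the structure of a Lie algebra, i.e., $\{\,,\}$
satisfies the first three items of Proposition 5.3.4.

(c) Show that Hamilton’s equations of motion from Equation (7.7) are equivalent
to
$$\dot{q}_k = \{q_k, H\} \quad \text{and} \quad \dot{p}_k = \{p_k, H\} \quad \text{for } k = 1, \ldots, n.$$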


7.2 Special Relativity; Pseudo-Riemannian Manifolds 
7.2.1 Concepts from Special Relativity 


In Chapter 2 we discussed why, in classical mechanics, it is not proper to assume
the existence or the possibility of finding an absolutely fixed frame. Nevertheless, one
of the foundational principles of classical physics, namely the principle of inertia, also
known as Newton’s First Law of Motion, affirms that a body with no net force
acting on it moves with constant velocity (or stays at rest, which corresponds to
zero velocity). However, if an observer is in an accelerating frame, then by virtue of that
motion, the observer may see a particle with no net force have an acceleration.
Some authors refer to this effect as inertial forces, which are not true forces but
only exist because of the motion of the observer. This leads to the concept of an
inertial frame as one in which a particle with no net force acting on it appears to
move in a straight line at constant speed.

The “principle of relativity” in classical mechanics states that the laws of dy- 
namics are the same in all inertial frames. We saw in Section 2.2 that if a frame 
$\mathcal{F} = (O, \vec{e}_1, \vec{e}_2, \vec{e}_3)$ is inertial, then another frame $\mathcal{F}' = (O', \vec{f}_1, \vec{f}_2, \vec{f}_3)$ is also inertial
if the origin of $\mathcal{F}'$ travels along a line $\vec{b} + \vec{v}t$ in reference to $\mathcal{F}$ and if the orthonormal
set of vectors of $\mathcal{F}'$ is a fixed rotation of the orthonormal set of vectors
in $\mathcal{F}$. It is a standard result in geometry that a direct isometry $f : \mathbb{R}^3 \to \mathbb{R}^3$ has
the form $f(\vec{x}) = A\vec{x} + \vec{b}$, where $A \in SO(3)$ and $\vec{b}$ is any fixed vector in $\mathbb{R}^3$. This
direct isometry corresponds to $\overrightarrow{OO'} = \vec{b}$ and $\vec{f}_i = A\vec{e}_i$ for $i = 1, 2, 3$. Consequently,
the frame $\mathcal{F}'$ differs from $\mathcal{F}$ by a fixed direct isometry composed with a translation
by $\vec{v}t$, which corresponds to movement along a fixed velocity vector, which could be
$\vec{0}$ if $\mathcal{F}'$ is stationary.
Of particular interest, the change of coordinates 


$$x' = x - vt, \qquad y' = y, \qquad z' = z, \tag{7.15}$$


where $v$ is a constant velocity, preserves the inertial property of frames. This is
called the Galilean transformation. It corresponds to an observer in the frame
$\mathcal{F}'$ moving at a constant speed $v$ along the $x$-axis, with all other basis vectors
between frames staying the same. The laws of mechanics expressed in one system
of coordinates will be the same when expressed in the other. If $P_1$ and $P_2$ are two
points in space with coordinates $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ in the frame $\mathcal{F}$, then the
coordinates in the frame $\mathcal{F}'$ are $(x_1', y_1', z_1')$ and $(x_2', y_2', z_2')$, which could very well
be different. The coordinates with respect to a frame are not a physical quantity
in that no law of mechanics will depend on the specific value of the coordinates.
However, if we denote $\Delta x = x_2 - x_1$, and likewise for the other coordinates, then
the distance $P_1P_2$ is preserved between inertial frames:


$$\Delta s \overset{\text{def}}{=} \sqrt{(\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2} = \sqrt{(\Delta x')^2 + (\Delta y')^2 + (\Delta z')^2}. \tag{7.16}$$


So the distance $\Delta s$ between points is a physical quantity, independent of the inertial
reference frame.

Consider the situation of passengers on a plane. When the plane is sitting 
stationary on the tarmac the passengers will observe all the laws of physics to be 
the same as if they were not on the plane. At that point, a frame F’ fixed to the 
plane is an inertial frame since we will assume that a frame $\mathcal{F}$ fixed to the Earth
is inertial. When the plane is at altitude and cruising speed, and not affected by
turbulence, except for the sound of the engines, there is no experiment that can be
done internal to the plane that would allow a passenger to discern that it is moving. 
However, while the plane accelerates during take-off it is not an inertial frame: if a 
passenger drops an object, it will not fall straight toward the ground even though 
the only non-negligible force acting on it is gravity, which is vertical. 

Classical mechanics implicitly treats the notion of time as absolute and inde- 
pendent of any frame. For centuries, no one could imagine anything different. To 
be more precise, in order to record events in different frames, we must use the space 
variables which come from the geometric frames but also a time variable. So we 
must imagine a clock attached to each frame. By calling time absolute, we mean 
that the only possible difference between clocks in classical frames is that they may
have $t = 0$ corresponding to different points in time. In particular, suppose we use
$t$ and $t'$ as the time variables in the frames $\mathcal{F}$ and $\mathcal{F}'$. If events $P_1$ and $P_2$ occur
at $t_1$ and $t_2$ in frame $\mathcal{F}$ and at $t_1'$ and $t_2'$ in frame $\mathcal{F}'$, then

$$\Delta t = \Delta t'.$$

So though the recorded point in time is not a physical quantity, an interval of time
$\Delta t$ is. In modern explanations of the Galilean transformation (7.15) it is common
to add the equation $t = t'$, though physicists working before special relativity would
never have thought of needing to write this.
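The two invariants of the Galilean transformation, distance between simultaneous points and the interval of time between events, can be checked numerically. The following is only a minimal sketch: the function names `galilean` and `distance` and the sample events are illustrative assumptions, not notation from the text.

```python
import math

def galilean(event, v):
    """Apply the Galilean transformation (7.15), with t' = t, to an event (t, x, y, z)."""
    t, x, y, z = event
    return (t, x - v * t, y, z)

def distance(e1, e2):
    """Euclidean distance (7.16) between the space parts of two events."""
    return math.dist(e1[1:], e2[1:])

v = 30.0                      # hypothetical relative speed of the frames
p1 = (2.0, 1.0, 2.0, 3.0)     # two points observed at the same time t = 2
p2 = (2.0, 4.0, 6.0, 3.0)
q1, q2 = galilean(p1, v), galilean(p2, v)

dist_F = distance(p1, p2)        # distance measured in frame F
dist_Fprime = distance(q1, q2)   # distance measured in frame F'

# A third event at a later time, to compare time intervals in both frames.
p3 = (7.5, 0.0, 0.0, 0.0)
dt_F = p3[0] - p1[0]
dt_Fprime = galilean(p3, v)[0] - q1[0]
```

The individual coordinates of `q1` and `q2` differ from those of `p1` and `p2`, but the distance and the time interval agree in both frames.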

Through the 19th century, experiments on the nature of space, light, electro- 
magnetism, and ether (the hypothetical medium through which it was thought that 
light propagates like sound through air) produced unexpected results that began 
to call into question even these fundamental perspectives on the nature of space 
and time. Einstein’s theory of special relativity resolved these observations by using
modern developments in mathematics to reformulate the notion of spacetime
according to two postulates:


Postulate 1 The laws of electrodynamics and optics are valid in all reference 
frames in which the laws of mechanics hold (inertial frames). 


Postulate 2 Light is always propagated in empty space with a definite velocity c 
that is independent of the motion of the emitting body. 


These principles bear out the surprising but experimentally observed fact that the
distance $\Delta s$ as defined in (7.16) and $\Delta t$ change between inertial frames and this
change is particularly evident when $v$ is large. Suppose that we locate an event in
a frame $\mathcal{F}$ using $(t, x, y, z)$ time and space variables and similarly for another frame
$\mathcal{F}'$. Using work by Minkowski, Lorentz and others, Einstein showed that Postulate
2 implies that if $\mathcal{F}'$ is moving at a constant speed $v$ in the direction of the $x$-axis
of $\mathcal{F}$ and if the unit basis vectors in each frame are the same, then the Galilean
transformation should be replaced with


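$$\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma v/c^2 & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}, \quad \text{where } \gamma = \frac{1}{\sqrt{1 - \dfrac{v^2}{c^2}}}. \tag{7.17}$$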


We sometimes write $\gamma(v)$ to indicate the dependence of $\gamma$ on the magnitude of the
velocity. In general, a Lorentz transformation is any change of coordinates from
$(t, x, y, z)$ to $(t', x', y', z')$ that consists of compositions of transformations in (7.17)
and rotations in the space variables.

Here are a few surprising consequences. 
• Contraction of length. Suppose that $P_1 = (t_1, x_1, y_1, z_1)$ and $P_2 = (t_2, x_2, y_2, z_2)$
are two events in frame $\mathcal{F}$ with $t_1 = t_2$, $y_1 = y_2$, and $z_1 = z_2$. Then the length
$P_1P_2$ is $\Delta x$ and $P_1P_2$ is a segment along the direction of travel. Then (7.17)
implies that

$$\Delta x' = x_2' - x_1' = (\gamma x_2 - v\gamma t_2) - (\gamma x_1 - v\gamma t_1) = \gamma \Delta x.$$

Then $\Delta x'$ is the length between $P_1$ and $P_2$ as seen in the frame $\mathcal{F}'$. So while
an observer in frame $\mathcal{F}'$ sees a segment of length $L_0 = \Delta x'$, the observer
in frame $\mathcal{F}$ will see it as having length

$$\Delta x = \frac{\Delta x'}{\gamma} = L_0 \sqrt{1 - \frac{v^2}{c^2}}.$$
• Loss of simultaneity. Consider the same two points as above. In frame $\mathcal{F}$ they
are simultaneous because $t_1 = t_2$. However,

$$\Delta t' = t_2' - t_1' = \left(\gamma t_2 - \frac{\gamma v}{c^2} x_2\right) - \left(\gamma t_1 - \frac{\gamma v}{c^2} x_1\right) = -\frac{\gamma v}{c^2} \Delta x.$$

Thus, an observer in frame $\mathcal{F}'$ does not view $P_1$ and $P_2$ as occurring simultaneously.


• We also note that if $P_1$ and $P_2$ were events with only a difference in their
$y$ coordinates, with the relative motion still along the $x$-axis, then $\Delta y' =
\Delta y$. Hence, there is no observed contraction of length perpendicular to the
direction of motion.


• Finally, the formulas for the Lorentz transformation imply that no particle
can move faster than the speed of light $c$.
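The first two consequences in the list above can be reproduced numerically from the boost (7.17). The following is a minimal sketch under stated assumptions: the helper names `gamma` and `boost`, the SI value used for c, and the sample events (two points 100 m apart, relative speed 0.6c) are all illustrative choices, not part of the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s (assumed SI value; the text keeps c symbolic)

def gamma(v, c=C):
    """Lorentz factor gamma(v) = 1 / sqrt(1 - v^2 / c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def boost(event, v, c=C):
    """Apply the transformation (7.17) to an event (t, x, y, z)."""
    t, x, y, z = event
    g = gamma(v, c)
    return (g * (t - v * x / c**2), g * (x - v * t), y, z)

v = 0.6 * C
p1 = (0.0, 0.0, 0.0, 0.0)
p2 = (0.0, 100.0, 0.0, 0.0)   # simultaneous with p1 in frame F, separated along x

q1, q2 = boost(p1, v), boost(p2, v)
dx_prime = q2[1] - q1[1]      # equals gamma * dx
dt_prime = q2[0] - q1[0]      # equals -gamma * v * dx / c^2, nonzero in F'
```

With these numbers the Lorentz factor is 1.25, so the separation in the primed coordinates is 125 m, and the two events that were simultaneous in F are separated in time in F'.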


Though distances and time intervals are not preserved across inertial frames,
the Minkowski line element given by

$$(\Delta s)^2 \overset{\text{def}}{=} -c^2(\Delta t)^2 + (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 \tag{7.18}$$

is preserved by any Lorentz transformation. Consequently, the postulates of special
relativity require us to jettison the assumption that time and space coordinates are
independent of each other. This perspective leads to the mental model of spacetime.
A point in this spacetime is called an event and we use coordinates $(t, x, y, z)$ with
respect to some frame.
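The invariance of the line element (7.18) under the boost (7.17) can be spot-checked numerically. This is a hedged sketch rather than a proof: the names `boost` and `interval_sq`, the SI value of c, and the two sample events are assumptions made for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s (assumed SI value)

def boost(event, v, c=C):
    """Boost (7.17) along the x-axis applied to an event (t, x, y, z)."""
    t, x, y, z = event
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (g * (t - v * x / c**2), g * (x - v * t), y, z)

def interval_sq(e1, e2, c=C):
    """Minkowski line element (7.18) between two events."""
    dt, dx, dy, dz = (b - a for a, b in zip(e1, e2))
    return -(c * dt) ** 2 + dx**2 + dy**2 + dz**2

e1 = (0.0, 0.0, 0.0, 0.0)
e2 = (1.0e-6, 200.0, 30.0, -40.0)   # hypothetical pair of events

ds2_F = interval_sq(e1, e2)
ds2_by_speed = {
    frac: interval_sq(boost(e1, frac * C), boost(e2, frac * C))
    for frac in (0.1, 0.5, 0.8)
}
```

Each entry of `ds2_by_speed` agrees with `ds2_F` up to rounding, even though the individual differences in t and x change from frame to frame.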

Scaling the right-hand side of (7.18) by any factor still gives us a quantity that
is preserved by Lorentz transformations. The choice of signs reflects the fact that
if $\Delta t = 0$ in some frame, then $\Delta s$ is precisely the usual distance between points in
that frame. The proper time interval $\Delta\tau$ between two events in spacetime is

$$(\Delta\tau)^2 \overset{\text{def}}{=} -\frac{1}{c^2}(\Delta s)^2 = \frac{1}{c^2}\left(c^2(\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2\right).$$
Figure 7.6: Light cone.


In this context, it is not as natural to talk about the “trajectory” of a particle,
since this term usually assumes that the space variables are expressed as a function
of time. In contrast, we can still model the motion of a particle by parametric
equations $\vec{x}(\lambda) = (t(\lambda), x(\lambda), y(\lambda), z(\lambda))$ for some parameter $\lambda$.


Definition 7.2.1. If a particle has the property that for all $\lambda$ in some interval
$[\lambda_1, \lambda_2]$, the particle exists in spacetime at $\vec{x}(\lambda)$, then the image of this curve is
called the world line of the particle.


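If we wish $\lambda$ to carry some sense of moving forward in time, we simply impose
the assumption that $dt/d\lambda > 0$. Then the rate of change $d\tau/d\lambda$ of the particle’s
proper time with respect to $\lambda$ satisfies

$$\left(\frac{d\tau}{d\lambda}\right)^2 = \left(\frac{dt}{d\lambda}\right)^2 - \frac{1}{c^2}\left(\frac{dx}{d\lambda}\right)^2 - \frac{1}{c^2}\left(\frac{dy}{d\lambda}\right)^2 - \frac{1}{c^2}\left(\frac{dz}{d\lambda}\right)^2.$$

Using the chain rule so that $dx/d\lambda = (dx/dt)(dt/d\lambda)$ and simplifying by $dt/d\lambda$, we get

$$\left(\frac{d\tau}{dt}\right)^2 = 1 - \frac{1}{c^2}\left(\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2\right).$$

So if in a frame $\mathcal{F}$ a particle has a velocity vector function of $\vec{v}(t)$, with speed $v = \|\vec{v}\|$, then

$$d\tau = \sqrt{1 - \frac{v^2}{c^2}}\, dt \quad \text{and} \quad \frac{dt}{d\tau} = \gamma(v). \tag{7.19}$$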
The quantity $d\tau$ is the proper time differential and the function

$$\tau = \int d\tau \tag{7.20}$$

is called the proper time of the particle traveling on its world line.

Proper time plays a central role in the theory of relativity since it is the same for
all inertial observers, i.e., unchanged by any Lorentz transformation. Furthermore,
this reminds us of the habit in elementary differential geometry to consider the
parametrization of a curve by arclength: since (7.20) defines a function $\tau(\lambda)$ such
that $d\tau/d\lambda > 0$, an inverse $\lambda(\tau)$ exists; using this function we can reparametrize
a particle’s world line by proper time. The proper time function defined in (7.20)
also emphasizes that the proper time between two events is the time ticked off by
a clock which actually passes through both events [50].
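As an illustration of (7.19) and (7.20), the proper time along a world line can be approximated by integrating the proper time differential numerically. The sketch below assumes the world line is described by its speed as a function of coordinate time t; the function name `proper_time`, the midpoint rule, and the sample speeds are illustrative choices, not taken from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s (assumed SI value)

def proper_time(speed, t0, t1, steps=10_000, c=C):
    """Midpoint-rule approximation of tau = integral of sqrt(1 - v(t)^2/c^2) dt."""
    h = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        v = speed(t0 + (k + 0.5) * h)
        total += math.sqrt(1.0 - (v / c) ** 2) * h
    return total

# A clock at rest in the frame and a clock moving at 0.8c, over 10 s of coordinate time.
tau_rest = proper_time(lambda t: 0.0, 0.0, 10.0)
tau_fast = proper_time(lambda t: 0.8 * C, 0.0, 10.0)
```

For a constant speed the integral reduces to the elapsed coordinate time divided by the Lorentz factor, so the moving clock in this example records 6 s while the clock at rest records 10 s.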

Suppose we have two events with a given Minkowski line element $(\Delta s)^2$ between them.
They are called


• timelike separated if $(\Delta s)^2 < 0$. This means that $(\Delta\tau)^2 > 0$. Clearly, for two
timelike separated events time must have elapsed. Also, since a particle cannot
travel faster than the speed of light, any two events on the world line of a
particle must be timelike separated. From another perspective, two events
are called timelike separated if a particle can travel between them (without
moving faster than the speed of light).


• lightlike separated if $(\Delta s)^2 = 0$. Only a particle traveling in a straight line at
the speed of light can connect two lightlike separated events.


• spacelike separated if $(\Delta s)^2 > 0$. No particle can have a world line that connects
two spacelike separated events. [60, Section 2.2]


The light cone based at an event $P$ is the set of all events that are lightlike separated
from $P$. Figure 7.6 shows the light cone for the origin, though we can only display
the variables $(t, x, y)$.
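The three cases above depend only on the sign of the line element (7.18), which makes them easy to test. The following is a minimal sketch: the names `interval_sq` and `classify`, the tolerance argument, and the choice of units with c = 1 in the sample data (made only so that the arithmetic is exact) are assumptions for illustration.

```python
def interval_sq(e1, e2, c):
    """Minkowski line element (7.18) between events (t, x, y, z)."""
    dt, dx, dy, dz = (b - a for a, b in zip(e1, e2))
    return -(c * dt) ** 2 + dx**2 + dy**2 + dz**2

def classify(e1, e2, c, tol=0.0):
    """Return 'timelike', 'lightlike', or 'spacelike' for a pair of events.

    tol is a hypothetical tolerance for deciding when the line element
    counts as zero in floating-point arithmetic.
    """
    ds2 = interval_sq(e1, e2, c)
    if ds2 < -tol:
        return "timelike"
    if ds2 > tol:
        return "spacelike"
    return "lightlike"

origin = (0.0, 0.0, 0.0, 0.0)
samples = [
    (5.0, 3.0, 0.0, 0.0),   # -25 + 9 < 0
    (5.0, 3.0, 4.0, 0.0),   # -25 + 9 + 16 = 0, on the light cone of the origin
    (1.0, 3.0, 4.0, 0.0),   # -1 + 9 + 16 > 0
]
kinds = [classify(origin, e, c=1.0) for e in samples]
```

Because the line element is preserved by Lorentz transformations, the label returned by `classify` does not depend on the inertial frame in which the coordinates are recorded.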


7.2.2 Minkowski Spacetime 


Euclidean geometry takes place in the inner product space $(\mathbb{R}^n, \cdot)$, where the inner
product is the dot product. In Chapter 4 we considered properties of vector spaces
equipped with other bilinear forms. In particular, in Example 4.3.12 we already
saw the following space.


Definition 7.2.2. We define $(n + 1)$-Minkowski spacetime as a real vector space
with coordinates $(x^0, x^1, \ldots, x^n)$ equipped with the bilinear form $\eta$ defined by

$$\eta = \eta_{\mu\nu}\, dx^\mu \otimes dx^\nu$$

with coefficients

$$\eta_{\mu\nu} = \begin{cases} -1, & \text{if } \mu = \nu = 0, \\ 1, & \text{if } \mu = \nu \geq 1, \\ 0, & \text{otherwise.} \end{cases}$$

The bilinear form $\eta$ is called the Minkowski metric. We denote the Minkowski
spacetime by $\mathbb{R}^{n,1}$.


For simplicity of notation, we will write $\vec{a} \cdot \vec{b}$ instead of $\eta(\vec{a}, \vec{b})$.
This vector space is suited for special relativity because we can set 


$$(x^0, x^1, x^2, x^3) = (ct, x, y, z). \tag{7.21}$$


Then under this coordinate change, the Minkowski line element in (7.18) corresponds to

$$\eta(\vec{a}, \vec{a}), \quad \text{where } \vec{a} = \begin{pmatrix} c\Delta t \\ \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}.$$

We can think of the difference between the $(t, x, y, z)$ coordinates and the $(x^0, x^1, x^2, x^3)$
coordinates as a change of units, so that in the $(x^i)$ system the speed of light is 1.

On the other hand, because of the centrality of the speed of light, especially since 
we postulate that it is the same for every observer, we might as well use a system 
of units in which it is 1. Many people are familiar with the light-year: applied to 
time a light-year is a usual year and when applied to distance, it means the distance 
traveled by light in a year. Or we could use the unit of meter: when applied to 
time, 1 m of time refers to how long it takes for light to travel 1 meter. Since in the 
SI (international system) $c \approx 3 \times 10^8$ m/s, the conversion between a meter of time
and a second of time is

$$1 \text{ m} = \frac{1}{3 \times 10^8} \text{ s}.$$
This convention of units is common among specialists in general relativity but is 
not universal throughout other branches of physics. Consequently, this text refrains 
from using this convention of units. So when applying Minkowski space to special 
relativity, we continue to assume x° = ct. (In doing so, we hope that specialists will 
not be put off and that non-specialists will not be confused.) 

In Example 4.3.12, we determined the automorphisms of the Minkowski metric. 
When c = 1, we had found precisely the format of Lorentz transformations as 
in (7.17), except that in (4.26) we had found a few possible differences of signs
in $\varepsilon_1 = \pm 1$ and $\varepsilon_2 = \pm 1$. These signs do not appear in (7.17). Consequently,
the allowed transformations between inertial frames in special relativity, namely 
Lorentz transformations, correspond to the restricted Lorentz transformation group 
discussed in Example 4.3.12. Over Minkowski space, this group of transformations 
plays a parallel role to the group of direct isometries in Euclidean geometry. 

An inner product space, defined in Definition 4.2.11, generalizes the Euclidean
space $\mathbb{R}^n$ equipped with the dot product. All the notions of angle, length, and
volume that we can define in Euclidean space have identical definitions in any
inner product space. Sylvester’s Law of Inertia (Theorem 4.2.14) affirms that for
any symmetric bilinear form, the signature is independent of the basis. Hence, we
can generalize the notion of the Minkowski spacetime to the following.


Definition 7.2.3. A vector space $V$ of dimension $n + 1$ is called a Minkowski space
if it is equipped with a bilinear form $\langle\,,\rangle$ that has signature $(1, n, 0)$ or $(n, 1, 0)$.


From the perspective of bilinear forms, the difference between an inner product 
space and a Minkowski space appears minor. However, as we already saw in the 
previous subsection, there are significant differences between the geometry of an 
inner product space and a Minkowski space. Most notably, for a vector $v \in V$, it
is not always true that $\langle v, v \rangle \geq 0$ or that $\langle v, v \rangle = 0$ implies that $v = 0$. In an inner
product space we define the length of a vector as $\sqrt{\langle v, v \rangle}$ but this notion of length
does not exist in the same way. Instead, just as when we discussed the difference
between timelike, lightlike and spacelike separated points above, in any Minkowski
space, there are regions in which $\langle v, v \rangle$ is positive, 0, or negative. If we do define
distances, we must do so differently in each of these regions. 

For applications to special relativity, we usually use a Minkowski space with 
signature (n, 1,0). However, the difference between the geometry of vector spaces 
with bilinear forms of signature (1,n,0) versus (n,1,0) is immaterial, mostly a 
matter of terms. 

The concept of a light cone from special relativity inspires the following defini- 
tion. 


Definition 7.2.4. Let $(V, \langle\,,\rangle)$ be a Minkowski space. The null cone is the collection
of points $\vec{x} = (x^0, x^1, \ldots, x^n)^\top$ such that $\langle \vec{x}, \vec{x} \rangle = 0$.


In an arbitrary Minkowski space, the null cone is a generalized cone with apex
at the origin in that for all $\vec{x}$ in the null cone, $\lambda\vec{x}$ is also in the cone. The null cone
separates the space into components in which $\langle \vec{x}, \vec{x} \rangle > 0$ or $\langle \vec{x}, \vec{x} \rangle < 0$. In special
relativity, we called these regions spacelike and timelike separated.


7.2.3 Physical Quantities in Special Relativity 


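Let $\vec{x}(\lambda)$ be a parametric curve that traces out the world line of a particle. We
define the four-velocity of the particle as the vector

$$U = \frac{d\vec{x}}{d\tau} = \left( c\frac{dt}{d\tau}, \frac{dx}{d\tau}, \frac{dy}{d\tau}, \frac{dz}{d\tau} \right). \tag{7.22}$$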


This vector is tangent to the world line. 

In the geometry of curves in Euclidean space, the derivative $d\vec{x}/ds$, where $s$ is
the arclength parameter, is the unit tangent vector. This is geometrically significant
since the arclength function is independent of any regular reparametrization
and independent of the position and orientation of the curve in space. Similarly,
the four-velocity is a vector whose identity is independent of the observer’s frame.
However, its components will change between frames by the corresponding Lorentz
transformation.

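We point out that, by the definition of proper time,

$$U \cdot U = -c^2\left(\frac{dt}{d\tau}\right)^2 + \left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2 + \left(\frac{dz}{d\tau}\right)^2 = -c^2.$$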
The four-momentum vector is defined as

$$p = mU, \tag{7.23}$$

where $m$ is the rest mass of the particle. In a frame $\mathcal{F}$, the components of the
four-momentum vector are

$$[p]_{\mathcal{F}} = \begin{pmatrix} p^0 \\ p^1 \\ p^2 \\ p^3 \end{pmatrix} = \begin{pmatrix} E/c \\ m\, dx/d\tau \\ m\, dy/d\tau \\ m\, dz/d\tau \end{pmatrix}, \tag{7.24}$$

where $E = p^0 c$ is the energy of the particle in the frame $\mathcal{F}$ and $(p^1, p^2, p^3)$ are the
components of its spatial momentum.

The four-acceleration is

$$a = \frac{dU}{d\tau}. \tag{7.25}$$

Since $U \cdot U$ is constant, $a \cdot U = 0$.
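The four-velocity and four-momentum can be computed from an ordinary velocity in a frame by using $dt/d\tau = \gamma(v)$ from (7.19). The sketch below is illustrative only: the names `minkowski_dot`, `four_velocity`, and `four_momentum`, the SI value of c, and the sample mass and velocity are assumptions, not data from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s (assumed SI value)

def minkowski_dot(a, b):
    """Minkowski metric of signature (3, 1) in coordinates (x^0, x^1, x^2, x^3)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def four_velocity(v3, c=C):
    """Components (7.22) of U for a particle with spatial velocity v3 = (vx, vy, vz)."""
    speed_sq = sum(comp**2 for comp in v3)
    g = 1.0 / math.sqrt(1.0 - speed_sq / c**2)   # dt/dtau = gamma(v)
    return (g * c, g * v3[0], g * v3[1], g * v3[2])

def four_momentum(m, v3, c=C):
    """Components (7.24) of p = m U for rest mass m."""
    return tuple(m * comp for comp in four_velocity(v3, c))

m = 2.0                              # hypothetical rest mass in kg
v3 = (0.3 * C, 0.4 * C, 0.0)         # speed 0.5c
U = four_velocity(v3)
p = four_momentum(m, v3)

E = p[0] * C                         # energy in this frame, E = p^0 c
p_spatial_sq = p[1]**2 + p[2]**2 + p[3]**2
```

In this example `minkowski_dot(U, U)` evaluates to $-c^2$ up to rounding, and the computed energy satisfies $E^2 = m^2c^4 + \|\vec{p}\|^2c^2$ with $\vec{p}$ the spatial momentum.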

Special relativity requires careful study to develop an effective intuition. This 
text has not provided any of the historical developments or experimental results that 
support this theory. We refer the reader to [24, 21, 42], each offering a comprehensive 
treatment of the subject. 


7.2.4 Pseudo-Riemannian Manifolds 


The definition of a Riemannian manifold arose from assuming a smooth manifold 
came equipped with an inner product on every tangent space, that varied smoothly 
across the manifold. The usefulness of Minkowski space for special relativity illus- 
trates that an inner product is not always what we might want for certain applica- 
tions. This inspires the following definition. 
Definition 7.2.5. A pseudo-Riemannian metric on a smooth manifold $M$ is a
symmetric tensor field $g$ of type (0,2) on $M$ that is nondegenerate at every point.
The pair $(M, g)$ is called a pseudo-Riemannian manifold.


In more detail, $g$ is a global section of $\mathrm{Sym}^2\, TM^*$ such that at each point $p \in M$,
we have $g_p(X, Y) = 0$ for all $Y \in T_pM$ if and only if $X = 0$ in $T_pM$.
This definition is looser than that of a Riemannian manifold since it has removed 
the positive-definite condition of inner products. Some authors refer to g on a 
pseudo-Riemannian manifold as a metric, whereas other authors prefer the term 
pseudometric to emphasize that g is not positive definite. 

From Sylvester’s Law of Inertia (Theorem 4.2.14), the signature of a symmetric 
bilinear form on a vector space is independent of the basis. More can be said for 
pseudo-Riemannian manifolds. 


Proposition 7.2.6. Let (M,g) be a pseudo-Riemannian manifold. Then on every 
connected component of M, the signature of g is the same. 


Proof. Recall that if $\langle\,,\rangle$ is a symmetric bilinear form on a vector space $V$ of dimension
$n$, then $\langle\,,\rangle$ is nondegenerate if and only if the signature $(s, t, r)$ has $r = 0$.
This condition is also equivalent to the coefficient matrix of $\langle\,,\rangle$ with respect to
some basis having a nonzero determinant.

For each $p \in M$, we consider the symmetric bilinear form $g_p$. By definition,
$g : M \to \mathrm{Sym}^2\, TM^*$ is a continuous function. The coefficients of the characteristic
polynomial of a matrix are polynomials, and therefore continuous, in the entries of
a matrix. By the Spectral Theorem, since the bilinear form $g_p$ is symmetric, its
matrix with respect to any basis is diagonalizable and all its eigenvalues are real.
Consequently, we can order the eigenvalues of $g_p$ as functions $\lambda_1(p) \geq \lambda_2(p) \geq \cdots \geq
\lambda_n(p)$. It is a well-known result that the zeros of a polynomial vary continuously with
the coefficients, even over $\mathbb{C}$ [38, p. 3]. Hence, the eigenvalue functions $\lambda_i : M \to \mathbb{R}$
are continuous.

However, $\det(g_p) = \lambda_1(p)\lambda_2(p) \cdots \lambda_n(p)$ is continuous and never 0. Hence,
$\lambda_i(p) \neq 0$ for all $p \in M$. From the proof of Theorem 4.2.14, in the signature
$(s, t, r)$, the value $s$ represents the number of eigenvalues that are positive, while $t$
represents the number of eigenvalues that are negative. Since the eigenvalue functions
are continuous and never 0, the number of eigenvalues that are positive and
the number that are negative stay constant over any connected component.


Definition 7.2.7. The signature of a pseudo-Riemannian manifold $(M, g)$ is the pair
$(s, t)$, where $(s, t, 0)$ is the signature of $g_p$ for each $p \in M$.


As we will see in Section 7.5, the theory of general relativity requires a model 
of space that is not flat but nonetheless behaves locally like Minkowski spacetime. 
Gravitational effects will cause the Lorentz metric to vary through space. A pseudo- 
Riemannian manifold of signature (3, 1) models this well. 
A review of the proof of the Levi-Civita Theorem shows that the proof only
used the symmetry and the nondegeneracy (invertibility of the $(g_{ij})$ matrix)
of the metric. Consequently, the following holds.


Theorem 7.2.8. The Levi-Civita Theorem, Theorem 6.2.11, also holds for pseudo- 
Riemannian manifolds. The coefficients for the Levi-Civita connection are also 
given by the Christoffel symbols defined in Proposition 6.2.18. 


Despite this theorem, a few relevant changes arise in the following contexts: 
• We can no longer define the length of a tangent vector $V$ if $g_p(V, V) < 0$.


• We cannot define the arclength of a curve $\gamma$ if $g_{\gamma(\sigma)}(\gamma'(\sigma), \gamma'(\sigma)) < 0$ for some
$\sigma \in I$.


• We might not be able to define the volume of a region $R$ of $M$.


Despite these possible obstructions, the equations for geodesics still satisfy the existence
and uniqueness properties of Theorem 6.3.12. Furthermore, like Proposition
6.3.13, geodesics on pseudo-Riemannian manifolds have a constant $\langle \gamma'(\sigma), \gamma'(\sigma) \rangle$.
Thus, geodesics come in three categories depending on the sign of $\langle \gamma'(\sigma), \gamma'(\sigma) \rangle =
g_{ij}\, \dot{\gamma}^i(\sigma)\, \dot{\gamma}^j(\sigma)$.

In the context of a Minkowski spacetime $\mathbb{R}^{3,1}$, where the metric $g$ has signature
(3,1), we say that a geodesic is


• a timelike geodesic if $g(\gamma'(\sigma), \gamma'(\sigma)) < 0$;
• a null geodesic if $g(\gamma'(\sigma), \gamma'(\sigma)) = 0$;


• a spacelike geodesic if $g(\gamma'(\sigma), \gamma'(\sigma)) > 0$.


PROBLEMS 


7.2.1. Use the interpretation of the four-momentum in (7.24) to recover the energy-momentum
relation $E^2 = m^2c^4 + p^2c^2$.


7.2.2. Action for a Relativistic Point Particle. The action of a free (no external forces)
non-relativistic particle traveling between $t = t_1$ and $t = t_2$ is simply

$$S = \int_{t_1}^{t_2} \frac{1}{2} m v^2\, dt = \int_{t_1}^{t_2} \frac{1}{2} m \left\| \frac{d\vec{x}}{dt} \right\|^2 dt,$$

and thus the Lagrangian is $L = T = \frac{1}{2}mv^2$. To give a relativistic formulation for
the action of a free particle, let us first assume we are in the context of a Minkowski
space with coordinates described in Equation (7.21). We must describe the action
in a way that is invariant under a Lorentz transformation. Therefore, we cannot
directly use the particle velocity since the velocity is not a Lorentz invariant. This
exercise seeks to justify the definition of the action of a relativistic point particle
with rest mass $m$ as

$$S = -mc^2 \int_P d\tau, \tag{7.26}$$

where we integrate over a world line $P$ of the particle. According to (7.19), the
action in (7.26) has an associated Lagrangian of

$$L = -mc^2 \sqrt{1 - \frac{v^2}{c^2}}. \tag{7.27}$$

(a) Calculate the 6th-order Taylor expansion of $\sqrt{1 - x^2}$, and show that the
quadratic approximation to $L$ is

$$L \approx -mc^2 + \frac{1}{2}mv^2. \tag{7.28}$$

(b) Using Equation (7.27), show that the generalized momentum vector $\vec{p}$ and
the Hamiltonian $H$ satisfy
$$\vec{p} = \frac{m\vec{v}}{\sqrt{1 - \dfrac{v^2}{c^2}}} \quad \text{and} \quad H = \frac{mc^2}{\sqrt{1 - \dfrac{v^2}{c^2}}}.$$
(This formula for $H$ conforms with the formula [24, (1-16)] for the total
energy of a free relativistic particle.)


7.2.3. Let (M,g) be a four-dimensional manifold with a Lorentzian metric that over a 
particular coordinate system (t, 2, y,z) has the matrix 


ke—gt? 0 0 gt 
a 0 1 0 0 
Ij = 0 01 0 


gt 0 0 1 
Show that the geodesics that have the initial condition $(x, y, z, t) = (0, 0, 0, 0)$ when
$s = 0$ satisfy
$$x = at, \quad y = bt, \quad \text{and} \quad z = -\frac{1}{2}gt^2 + ct.$$
Use this to give a physical interpretation of this metric.
7.2.4. Let $M$ be a pseudo-Riemannian manifold of dimension 3 with the line element

$$ds^2 = -dt^2 + \frac{1}{1 - \lambda r^2}\, dr^2 + r^2\, d\theta^2,$$

where we assume $r^2 < 1/\lambda$. Show that the null geodesics satisfy the relationship

$$\left(\frac{dr}{d\theta}\right)^2 = r^2(1 - \lambda r^2)(Cr^2 - 1),$$

where $C$ is a constant. Use the substitution $u = 1/r^2$ to solve this differential
equation, and show that the solutions are ellipses if we interpret $r$ and $\theta$ as the
usual polar coordinates.
7.2.5. Let $g > 0$ be a positive constant. Let $M$ be a pseudo-Riemannian manifold of
dimension 4 that has a line element of

$$-ds^2 = (1 - 2gx)\, dt^2 - \frac{1}{1 - 2gx}\, dx^2 - dy^2 - dz^2.$$

Show that the curve defined by $(1 - 2gx)\cosh^2(gt) = 1$, $y = z = 0$ is a geodesic
passing through the origin.
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/ 
Ss 
Figure 7.7: Stereographic projection from a hyperboloid. 


7.2.6. Determine the geodesics in a pseudo-Riemannian manifold that has the line element 
metric 


de sade? tae dy” + dz’. 
xz 


7.2.7. Consider a Lorentzian metric given by $ds^2 = -dt^2 + f(t)^2\, dx^2$, where $f(t)$ is any
smooth function of $t$. Show that the Einstein tensor is identically 0.


7.2.8. Let $\mathbb{H}^n_R$ be the upper half of the two-sheeted hyperboloid in $\mathbb{R}^{n+1}$, defined by

$$(x^1)^2 + \cdots + (x^n)^2 - (x^{n+1})^2 = -R^2 \quad \text{and} \quad x^{n+1} > 0.$$

Equip $\mathbb{H}^n_R$ with the metric $g = i^*\eta$, where $i : \mathbb{H}^n_R \to \mathbb{R}^{n+1}$ is the inclusion map and
$\eta$ is the Minkowski metric expressed as

$$\eta = (dx^1)^2 + \cdots + (dx^n)^2 - (dx^{n+1})^2.$$

Define the manifold $\mathbb{B}^n_R$ as the $n$-dimensional open ball in the $x^{n+1} = 0$ hyperplane
of $\mathbb{R}^{n+1}$ with center at the origin and radius $R$. Equip $\mathbb{B}^n_R$ with the metric $\tilde{g}$ defined
in Problem 6.1.12, namely

$$\tilde{g} = \frac{4R^4}{(R^2 - \|u\|^2)^2}\left((du^1)^2 + \cdots + (du^n)^2\right).$$

We define the stereographic projection $\pi : \mathbb{H}^n_R \to \mathbb{B}^n_R$ such that $\pi(p) = q$ is the
unique point in $\mathbb{B}^n_R$ on the line segment $Sp$, where $S = (0, 0, \ldots, 0, -R)$. Figure 7.7
depicts this projection for $n = 2$.

(a) Prove that $\pi(x^1, \ldots, x^n, x^{n+1}) = \dfrac{R}{R + x^{n+1}}\,(x^1, \ldots, x^n)$.

(b) For $u \in \mathbb{B}^n_R \subset \mathbb{R}^n$, show that $\pi^{-1}(u) = \left( \dfrac{2R^2 u}{R^2 - \|u\|^2},\; R\,\dfrac{R^2 + \|u\|^2}{R^2 - \|u\|^2} \right)$.

(c) Show that $(\pi^{-1})^* g = \tilde{g}$.

(d) Deduce that $(\mathbb{H}^n_R, g)$ is a Riemannian manifold (even though the metric is
a pull-back from a pseudo-Riemannian metric) and that $\pi$ is an isometry
between Riemannian manifolds.


7.3 Electromagnetism 
7.3.1 Maxwell’s Equations 


The goal of this section is to summarize the dynamics of a charged particle moving 
under the influence of an electric field E and a magnetic field B, both of which are 
time and space dependent. In no way does this brief section attempt to encapsulate 
all of the theory of electromagnetism. Rather we show how to pass from a classical 
formulation of a few of the basic laws of electromagnetism to a modern formulation 
that uses Minkowski metrics and the language of forms. (Note: all formulas in this 
section use CGS units, i.e., centimeters-grams-seconds units. In this system, force is 
measured in dyne, energy in erg, electric charge in esu, electric potential in statvolt, 
and the magnetic field strength in gauss.) 

The mathematical theory relies on the model (based on experiment) that point 
charges exist, i.e., particles of negligible size with charge. For example, the electron 
and the proton fit this bill. In contrast, magnetic monopoles — point-like particles 
with a magnetic charge — do not (appear to) exist. The observation of a single mag- 
netic monopole would change the rest of the theory (by adding an extra magnetic 
charge density and magnetic current) but even this “would not alter the fact that in 
matter as we know it, the only sources of the magnetic field are electric currents.” 
[46, p. 405] 

Coulomb’s law of electrostatic force states that the force between two point
charges is inversely proportional to the square of the distance between them, namely,
$$\vec F = \frac{q_1 q_2}{r^2}\,\hat r, \tag{7.29}$$
where $q_1$ and $q_2$ are the respective charges of the particles, $r$ is the distance between
them, and $\hat r$ is the unit vector pointing from the location of the point charge 1 to
the point charge 2. One then considers systems of charges, modeled by a charge
density $\rho(x, y, z)$, acting on a point particle with charge $q$. The electric field of a
charged system is the vector field $\vec E(x, y, z) = \frac{1}{q}\vec F$, where $\vec F$ is the force the system
would exert on a particle of charge $q$ at position $(x, y, z)$. It is calculated by
$$\vec E(x, y, z) = \iiint_{\mathbb{R}^3} \rho(x', y', z')\,\frac{(x - x',\ y - y',\ z - z')}{\left((x - x')^2 + (y - y')^2 + (z - z')^2\right)^{3/2}}\,dx'\,dy'\,dz'. \tag{7.30}$$
If the charge density $\rho$ depends on time $t$ as well, then $\vec E$ is a time-dependent vector
field $\vec E(x, y, z, t)$. An application of Gauss’s Theorem from vector calculus gives
Gauss’s Law for electrostatics, i.e.,
$$\operatorname{div} \vec E = 4\pi\rho, \tag{7.31}$$


where the divergence is only taken in the space variables. 
Consider the following function defined in terms of the charge density $\rho$:
$$\varphi(x, y, z, t) = \iiint_{\mathbb{R}^3} \frac{\rho(x', y', z', t)}{\sqrt{(x - x')^2 + (y - y')^2 + (z - z')^2}}\,dx'\,dy'\,dz'. \tag{7.32}$$
By taking the gradient with respect to the space variables $(x, y, z)$ and passing it under
the integral, we see that
$$\vec E = -\nabla\varphi. \tag{7.33}$$
This shows that $\vec E$ is a conservative vector field. The function $\varphi$ is called the electric
potential. The potential energy of the electric force field acting on a particle with
charge $q$ is $V = q\varphi$. If the system of electrical charges is moving, then $\vec E$, $\varphi$, and $\rho$
are also functions of time, but (7.30) and (7.33) still hold with the caveat that the
integration and the gradient only involve the space variables.
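
As an informal check of the relation between the field and the potential, the following sketch compares the Coulomb field of a single point charge at the origin with a finite-difference approximation of the negative gradient of its potential. It assumes Gaussian units and a point charge in place of a general charge density; the function names and the step size h are illustrative choices, not part of the text.

```python
import math


def potential(q, x, y, z):
    """Coulomb potential q / r of a point charge q at the origin (Gaussian units)."""
    return q / math.sqrt(x * x + y * y + z * z)


def field_exact(q, x, y, z):
    """Coulomb field q * (x, y, z) / r^3 of the same point charge."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (q * x / r3, q * y / r3, q * z / r3)


def field_from_potential(q, x, y, z, h=1e-5):
    """Approximate -grad(potential) with central differences of step h."""
    ex = -(potential(q, x + h, y, z) - potential(q, x - h, y, z)) / (2 * h)
    ey = -(potential(q, x, y + h, z) - potential(q, x, y - h, z)) / (2 * h)
    ez = -(potential(q, x, y, z + h) - potential(q, x, y, z - h)) / (2 * h)
    return (ex, ey, ez)


if __name__ == "__main__":
    # Compare the two fields at the sample point (1, 2, 2), where r = 3.
    print(field_exact(2.0, 1.0, 2.0, 2.0))
    print(field_from_potential(2.0, 1.0, 2.0, 2.0))
```

The two printed triples should agree to many decimal places, which is the pointwise content of $\vec E = -\nabla\varphi$ for this simple source.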

A system of time-dependent charge density also induces what are called elec-
trical currents. The current density is the vector field $\vec J$ that at each point $(x, y, z)$
measures the direction of the current and how much current is passing per area and
per time. A direct application of Gauss’s Theorem from vector calculus gives
$$\operatorname{div} \vec J = -\frac{\partial\rho}{\partial t}. \tag{7.34}$$


At the heart of electromagnetism lies an interdependence between magnetic 
fields and electric fields. A charged particle that is moving in the presence of a 
current experiences a force perpendicular to its velocity. That force acting on the 
particle is called the magnetic force. The magnetic field of a system of charges is 
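the field $\vec B$ defined implicitly by
$$\vec F = q\left(\vec E + \frac{1}{c}\,\vec v \times \vec B\right). \tag{7.35}$$
This overall effect on a particle with charge $q$ is called the electromagnetic force.
It is no longer conservative due to the presence of $\vec v$. Nonetheless, we define the
magnetic vector potential $\vec A$ by
$$\vec A(x, y, z, t) = \frac{1}{c}\iiint_{\mathbb{R}^3} \frac{\vec J(x', y', z', t)}{\sqrt{(x - x')^2 + (y - y')^2 + (z - z')^2}}\,dx'\,dy'\,dz'. \tag{7.36}$$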


Furthermore, Faraday discovered that not only does a time-dependent distribu-
tion of charge induce a magnetic field, a variable magnetic field similarly affects
the electric field. The relationship between the electric and magnetic fields can be
summarized by two separate sets of equations: Faraday’s law for potential, i.e.,
$$\vec E = -\nabla\varphi - \frac{1}{c}\frac{\partial\vec A}{\partial t}, \qquad \vec B = \nabla\times\vec A, \tag{7.37}$$
and Maxwell’s equations,
$$\begin{aligned}
\nabla\cdot\vec E &= 4\pi\rho, & \nabla\cdot\vec B &= 0,\\
\nabla\times\vec E + \frac{1}{c}\frac{\partial\vec B}{\partial t} &= \vec 0, & \nabla\times\vec B - \frac{1}{c}\frac{\partial\vec E}{\partial t} &= \frac{4\pi}{c}\vec J.
\end{aligned} \tag{7.38}$$
Maxwell’s equations stand as a crowning achievement in electromagnetism. They 
encapsulate the interdependent phenomena of induction and the static source of the 
various fields. Furthermore, solving the equations for empty space (i.e., $\rho = 0$ and
$\vec J = \vec 0$) leads to an interpretation of light as an electromagnetic wave.

Hidden in Maxwell’s equations lie relativistic effects. If a charged particle travels 
fast (a non-trivial fraction of the speed of light), then due to relativistic effects, its 
electric field appears distorted to a stationary observer. Lorentz transformations 
in (7.17) describe how the electric and magnetic fields look different in different 
moving frames of reference. 
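
To make the frame dependence concrete, here is a small numerical sketch. It assumes the standard Gaussian-unit transformation of the fields for a frame moving with speed $\beta c$ along the x-axis (the same relations that appear as (7.49) in the problems), and checks that the two combinations $\|\vec E\|^2 - \|\vec B\|^2$ and $\vec E\cdot\vec B$ are unchanged. The function names and sample values are illustrative.

```python
import math


def boost_fields_x(E, B, beta):
    """Fields observed in a frame moving with velocity beta*c along the x-axis.

    Uses the standard special-relativistic transformation in Gaussian units.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    E1, E2, E3 = E
    B1, B2, B3 = B
    E_new = (E1, gamma * (E2 - beta * B3), gamma * (E3 + beta * B2))
    B_new = (B1, gamma * (B2 + beta * E3), gamma * (B3 - beta * E2))
    return E_new, B_new


def dot(u, v):
    """Euclidean dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def invariants(E, B):
    """Return the pair (|E|^2 - |B|^2, E . B)."""
    return dot(E, E) - dot(B, B), dot(E, B)


if __name__ == "__main__":
    E, B = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0)
    E_new, B_new = boost_fields_x(E, B, 0.6)
    print(invariants(E, B))
    print(invariants(E_new, B_new))
```

The individual components change under the boost, while the two printed pairs agree up to rounding.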


7.3.2 Covariant Formula of Electromagnetism 


Having developed considerable analytical machinery in the previous chapters, we 
are in a position to reformulate the theory of electromagnetism in a more concise 
way. We work in a four-dimensional Minkowski spacetime, which means we use the
pseudometric $g = \eta$, as defined in Section 7.2.2. As before, we label the coordinates
$x^0 = ct$, $x^1 = x$, $x^2 = y$, and $x^3 = z$.
Define the 4-vector potential A as the covector (1-form) with components
$$A_i = (-\varphi, A_1, A_2, A_3). \tag{7.39}$$


We call the electromagnetic tensor F the 2-form
$$\begin{aligned}
F &= -\sum_{i=1}^{3} E_i\,dx^0\wedge dx^i + \sum_{i=1}^{3} B_i\,(\star dx^i)\\
  &= -E_1\,dx^0\wedge dx^1 - E_2\,dx^0\wedge dx^2 - E_3\,dx^0\wedge dx^3\\
  &\quad + B_1\,dx^2\wedge dx^3 - B_2\,dx^1\wedge dx^3 + B_3\,dx^1\wedge dx^2,
\end{aligned}$$
where by $\star$ we mean the Hodge star operator acting only on the space variables. If
we exhibit the components of F in an antisymmetric matrix, we write
$$F_{\mu\nu} = \begin{pmatrix}
0 & -E_1 & -E_2 & -E_3\\
E_1 & 0 & B_3 & -B_2\\
E_2 & -B_3 & 0 & B_1\\
E_3 & B_2 & -B_1 & 0
\end{pmatrix}. \tag{7.40}$$
(As always, we use the convention that in $F_{\mu\nu}$, the index $\mu$ corresponds to the row
and $\nu$ corresponds to the column of the representing matrix.) In Problem 5.4.7, we
showed that Faraday’s law for potential from Equation (7.37) can be expressed as
$$F_{\alpha\beta} = \partial_\alpha A_\beta - \partial_\beta A_\alpha. \tag{7.41}$$


Example 4.5.9 showed that a collection of component functions defined this way in 
terms of a covariant field A does define a tensor field of type (0, 2). 

We also define the 4-current vector by $J = (c\rho, J^1, J^2, J^3)$, where $\rho$ is the
charge density and $(J^1, J^2, J^3) = \vec J$ is the classic current density vector. Using
the Minkowski metric $\eta$, recall that by $F^{\alpha\beta}$ we mean the raising-indices operation
$F^{\alpha\beta} = \eta^{\alpha\mu}\eta^{\beta\nu}F_{\mu\nu}$ and similarly for the lowering operation $J_\alpha = \eta_{\alpha\beta}J^\beta$. Recall
that we write $J^\flat$ for the covector associated to J. In coordinates, we have
$$F^{\alpha\beta} = \begin{pmatrix}
0 & E_1 & E_2 & E_3\\
-E_1 & 0 & B_3 & -B_2\\
-E_2 & -B_3 & 0 & B_1\\
-E_3 & B_2 & -B_1 & 0
\end{pmatrix}
\quad\text{and}\quad J_\alpha = (-c\rho, J^1, J^2, J^3).$$
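
The raising operation can be illustrated with a short sketch. It builds the matrix of covariant components of the electromagnetic tensor from sample field values and raises both indices with the diagonal Minkowski metric of signature (3,1), time coordinate first. The helper names and the sample numbers are assumptions made for the illustration only.

```python
# Diagonal entries of the Minkowski metric, time coordinate first.
ETA = [-1, 1, 1, 1]


def em_tensor_lower(E, B):
    """Matrix of covariant components F_{mu nu} built from E and B."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [
        [0, -E1, -E2, -E3],
        [E1, 0, B3, -B2],
        [E2, -B3, 0, B1],
        [E3, B2, -B1, 0],
    ]


def raise_indices(F_lower):
    """Compute F^{ab} = eta^{am} eta^{bn} F_{mn}.

    Because eta is diagonal with entries +-1, each entry only picks up a sign.
    """
    return [[ETA[a] * ETA[b] * F_lower[a][b] for b in range(4)] for a in range(4)]


if __name__ == "__main__":
    F_up = raise_indices(em_tensor_lower((1, 2, 3), (4, 5, 6)))
    for row in F_up:
        print(row)
```

Only the entries with exactly one time index change sign, so the electric entries flip while the magnetic block is unchanged.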


With this setup, it is not hard to show that Maxwell’s equations can be written in 
tensor form as 4 
FP = 78, and e%4(0,Fxg) =0, (7.42) 
c 
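where the last equation holds for all $\delta = 0, 1, 2, 3$. The second equation in (7.42)
can be written equivalently as
$$\partial_\gamma F_{\alpha\beta} + \partial_\alpha F_{\beta\gamma} + \partial_\beta F_{\gamma\alpha} = 0 \tag{7.43}$$
for all indices $\alpha$, $\beta$, $\gamma$.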


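Using 4-vectors, one can describe the relationship between the current 4-vector and
the potential 4-vector. First, we define the D’Alembertian operator as
$$\Box = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}. \tag{7.44}$$
We point out that the D’Alembertian is equivalent to the Laplacian in the coordinates
$(x^0, x^1, x^2, x^3)$ with the Minkowski metric. Since $x^0 = ct$, we have
$$\frac{\partial}{\partial x^0} = \frac{1}{c}\frac{\partial}{\partial t}.$$
So $\nabla = \left(\frac{1}{c}\partial_t, \partial_x, \partial_y, \partial_z\right)$. The Laplacian is $\nabla^2 = \nabla\cdot\nabla$, where the product is taken
with respect to $\eta$, so
$$\nabla^2 = -\frac{\partial^2}{\partial (x^0)^2} + \frac{\partial^2}{\partial (x^1)^2} + \frac{\partial^2}{\partial (x^2)^2} + \frac{\partial^2}{\partial (x^3)^2}.$$
Since $\partial/\partial x^0 = \frac{1}{c}\,\partial/\partial t$, we see that the D’Alembertian operator is the same as the
Laplacian for this context.
Applying Equation (7.34) to Equations (7.36) and (7.32), one can show that
$\nabla\cdot\vec A = -\frac{1}{c}\,\partial\varphi/\partial t$. Thus, taking the divergence of $\vec E$ expressed in Equation (7.37),
we obtain
$$\Box\varphi = -4\pi\rho. \tag{7.45}$$
Using similar calculations, we can also show that
$$\Box\vec A = -\frac{4\pi}{c}\vec J. \tag{7.46}$$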
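
For readers who want the intermediate step behind (7.45), the computation can be written out as follows. This is a sketch that uses only Gauss's law from (7.38), Faraday's law for potential (7.37), and the relation $\nabla\cdot\vec A = -\frac{1}{c}\,\partial\varphi/\partial t$ stated above; it assumes the fields are smooth enough to interchange the partial derivatives.

```latex
\begin{align*}
4\pi\rho
  &= \nabla\cdot\vec E
   = \nabla\cdot\Bigl(-\nabla\varphi - \frac{1}{c}\frac{\partial\vec A}{\partial t}\Bigr) \\
  &= -\nabla^2\varphi - \frac{1}{c}\frac{\partial}{\partial t}\bigl(\nabla\cdot\vec A\bigr)
   = -\nabla^2\varphi + \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2}
   = -\Box\varphi .
\end{align*}
```

Here $\nabla^2$ denotes the spatial Laplacian, so the last equality is just the definition (7.44).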


7.3.3 Electromagnetism Expressed in Differential Forms 


Much of the reformulation of Faraday’s and Maxwell’s equations in the previous 
paragraphs can be expressed in even simpler terms using differential forms. We still
work under the assumption that we work in Minkowski space $\mathbb{R}^{3,1}$. Faraday’s law,
expressed classically as (7.37) and in covariant components in (7.41), simply means
$$dA = F. \tag{7.47}$$


Interestingly enough, this formula does not refer to any metric but simply claims 
that the electromagnetic tensor F is an exact 2-form. 

Since F is exact, it is also closed with dF = 0. This property again has nothing 
to do with a metric. It is easy to check that it corresponds to the second and third 
Maxwell equations in (7.38). In Section 6.2.5, we mentioned the divergence operator 
on any tensor field over a Riemannian manifold. In order to take the divergence 
on a covariant index, we first need to raise that index. If we take the divergence 
operator of F in the first index, by (6.25) in components it is 


$$(g^{ij}F_{j\alpha})_{;i} = g^{ij}F_{j\alpha;i},$$
where we are using the covariant derivative associated to the Levi-Civita connection. 
It is straightforward to prove that Maxwell’s first and fourth equations are equivalent
to $\operatorname{div} F = \frac{4\pi}{c}J^\flat$. Hence, we can write Maxwell’s equations as
$$\operatorname{div} F = \frac{4\pi}{c}J^\flat \quad\text{and}\quad dF = 0. \tag{7.48}$$
Cc 


This formulation of Faraday’s equation and Maxwell’s equations lends itself to 
generalization from Minkowski space to a pseudo-Riemannian manifold of signature 
(3,1). In fact, dF = 0 follows immediately from F = dA but the divergence operator 
in the first equation in the above pair depends on the metric of the manifold. 


PROBLEMS 


7.3.1. Suppose we are in $\mathbb{R}^{3,1}$, and let $\mathcal{F}$ be the standard reference frame. Suppose that
another frame $\mathcal{F}'$ keeps the x-, y-, and z-axes in the same orientation but has an
origin $O'$ that travels at velocity $v$ along the x-axis of $\mathcal{F}$. Let $\vec E$ and $\vec B$ be joint
electric and magnetic force fields with coordinates $(E_1, E_2, E_3)$ and $(B_1, B_2, B_3)$
as observed in $\mathcal{F}$. Use the electromagnetic tensor from Equation (7.40) and the
coordinate transformation described in (7.17) to show that in $\mathcal{F}'$ the components
of the same vector fields are observed as having the components
$$\begin{aligned}
E_1' &= E_1, & E_2' &= \gamma(E_2 - \beta B_3), & E_3' &= \gamma(E_3 + \beta B_2),\\
B_1' &= B_1, & B_2' &= \gamma(B_2 + \beta E_3), & B_3' &= \gamma(B_3 - \beta E_2).
\end{aligned} \tag{7.49}$$
(This result conforms to standard results of special relativistic effects in electro-
magnetism [46, (58) Chap. 6].)

7.3.2. Show that (7.42) is equivalent to (7.38).

7.3.3. Let $f$ be a smooth function defined over the Minkowski space $\mathbb{R}^{3,1}$. As always, set
$x^0 = ct$, $x^1 = x$, $x^2 = y$, and $x^3 = z$. Prove that
$$d(\star(df)) = c\,\Box f\,dt\wedge dx\wedge dy\wedge dz.$$

7.3.4. Suppose we are in Minkowski spacetime.

(a) Prove that $\tfrac{1}{2}F^{\alpha\beta}F_{\beta\alpha} = \|\vec E\|^2 - \|\vec B\|^2$. Conclude that $\|\vec E\|^2 - \|\vec B\|^2$ is pre-
served under any Lorentz transformation.

(b) Prove that $-\tfrac{1}{4}\,\eta_{ij}(\star F)^{jk}\eta_{kl}F^{li} = \vec E\cdot\vec B$. Conclude that $\vec E\cdot\vec B$ is preserved
under any Lorentz transformation.

7.3.5. Recall $\star$ as the Hodge star operator. Show that in Minkowski spacetime with the
metric $\eta$, the operator $\star d\star$ is the same as the divergence operator div over the first
index. Conclude that Maxwell’s equations can be expressed as
$$dF = 0, \qquad \star d\star F = \frac{4\pi}{c}J^\flat. \tag{7.50}$$

7.3.6. Let M be any pseudo-Riemannian manifold. Consider the operation that consists
of the compositions $\star d\star d$.

(a) Show that $\star d\star d$ is an $\mathbb{R}$-linear operator $\Omega^k(M) \to \Omega^k(M)$ for $k < \dim M$.

(b) Let $M = \mathbb{R}^n$ be a standard Euclidean space. Recalling that $\Omega^0(M) = C^\infty(M)$,
show that for any smooth function $f$,
$$\star d\star df = \nabla^2 f,$$
where $\nabla^2$ is the usual Laplacian $\nabla^2 = \dfrac{\partial^2}{\partial (x^1)^2} + \cdots + \dfrac{\partial^2}{\partial (x^n)^2}$.

(c) Suppose we are in Minkowski space. Show that $\Box = \star d\star d$, and conclude
that Equations (7.45) and (7.46) can be summarized by
$$\star d\star dA = \frac{4\pi}{c}J^\flat.$$

7.3.7. In [60, (5.38)], the author states that “the full action for the electrically charged
point particle is”


S= =moe f dr + a A, dx", (7.51) 
P CIP 


where $d\tau$ is given by (7.19) and $A_\mu$ are the components of the potential covec-
tor given in Equation (7.39). Suppose a charged particle travels along a path
$(x^1(t), x^2(t), x^3(t))$.

(a) Write the action in Equation (7.51) as an integral of time $t$ alone.


(b) Determine the Lagrangian for this system, and write down Lagrange’s equa- 
tions of motion. 


(c) Write down Hamilton’s equations of motion. 


7.4 Geometric Concepts in String Theory 


What is generically understood in physics as string theory is a collection of theories 
called superstring theories. The name of these models derives from the fact that 
in many of the first proposed theories, elementary particles were viewed as strings. 
Since then, theories have been formulated in terms of points or surfaces. The string 
can be either open on the ends or be a closed loop. For theoretical reasons, the length 
of the strings should be on the order of the Planck length, $\ell_P = 1.6162 \times 10^{-35}$ m.
This size is so small as to render it impossible to directly observe the string structure 
with present technology or, so it would seem, with technology that will be available 
in the near future. In this model, observed properties of the particle, such as mass 
or electric charge, arise as specific properties of the vibration of the string. 

A string in everyday experience is made of some material like thread or
wire. One could ask what these strings are made of, i.e., what is the nature of 
the “thread.” This type of question is, however, vacuous because the string is not 
made up of any constituent parts. One should rather think of the particle-wave 
duality that drew considerable debate during the inception of quantum mechanics. 
In this duality, under different circumstances, a particle would exhibit behavior like 
a billiard ball while in other circumstances it would display a wave-like behavior. 
While some physicists discussed the fundamental nature of particles, many simply 
emphasized the fact that growing experimental evidence supported the probability 
wave function model, without worrying about the ontology. 

As a refinement to the Standard Model of quantum mechanics, string theory 
bears a similar duality in that one thinks of the particle as having a string nature as 
well as a probability wave nature. The space of the “state” functions (i.e., functions 
that describe the state of the particle) is the same, but there are more operators 
than in the point-particle theory. In practice, instead of debating the nature of the 
string, the theories work out mathematical consequences of this formulation in the 
hope that the resulting theory agrees with experimental observations and unifies 
without irreparable inconsistencies with previously established theories.

Figure 7.8: The world sheet of a (nonrelativistic) closed string.


Our goal in this section is to introduce a few of the geometric notions that un- 
derlie the relativistic dynamics of a string. Issues of quantization of these dynamics 
exceed the scope of this book. 

We first consider the nonrelativistic dynamics of a string of length $L$ in Euclidean
$\mathbb{R}^n$. If the string is open, we can pick an end of the string and use the arclength
parameter to locate a point on the string. If it is closed, we pick a specific point on
the string and locate other points on the string using the same arclength parameter.
The position of the string in space at time $t$ is described by a smooth function
$X : [0, L]\times\mathbb{R} \to \mathbb{R}^n$, where $X(s, t)$ is the location of the point of position $s$ on the
string at time $t$. Therefore, while the trajectory of a classic particle is described by
a curve in $\mathbb{R}^n$, the “trajectory” of a string is a surface (see Figure 7.8). In keeping
with the terminology of “world line” for a relativistic point particle, the surface S 
is called the world sheet of the string. 

To study the dynamics of a relativistic string, we must work in the context of 
a Lorentzian spacetime. (This can be curved or flat and can have any number of 
space dimensions but only one time dimension. In other words, the pseudometric 
on the space has index 1.) As always, the coordinates in the spacetime are
$x^\mu = (x^0, x^1, \ldots, x^d)$, with $d$ being the number of space dimensions and $x^0 = ct$.

One can no longer parametrize the world sheet S with the time parameter $t$ since
$x^0 = ct$ is one of the coordinates in the target space. Nonetheless, the world sheet
requires two parameters, say $\xi^1$ and $\xi^2$. Furthermore, we can no longer give the
same definition of the domain of X as in the nonrelativistic description of moving 
strings. One refers to the domain of X as the parameter space for the world sheet. 

Now we encounter something new in Lorentzian spacetime that we never encountered
in the study of Riemannian manifolds. The world sheet S must be such
that at each point there exists at least one spacelike tangent vector and at least one
timelike tangent vector (recall Section 7.2.4 for the definitions). It is not hard to
see the need for a spacelike tangent vector. Any point in time corresponds to a slice
of $x^0$. Intersecting such a slice with S gives the locus of the string at a given time.
Any point on the string in this slice will have a tangent vector that is a spacelike
vector. On the other hand, if there did not exist a timelike tangent vector at some
point on S, one would interpret that as that point not having any evolution through
time. This is not a physical situation. Hence, at each point P of S, the tangent
space has both a timelike direction and a spacelike direction. This is the criterion
for motion of the string.

If $g$ is the pseudo-Riemannian metric of the spacetime target space, then the
induced metric $\tilde g$ on the tangent bundle $TS$ is defined by
$$\tilde g_{ij} = g\left(\frac{\partial X}{\partial\xi^i}, \frac{\partial X}{\partial\xi^j}\right).$$
The criterion of motion for the string is equivalent to $\tilde g$ having index 1 at all points
of S.

Proposition 7.4.1. Let $X(\xi^1, \xi^2)$ be a parametrization for a surface S in Lorentzian
space with metric $g$ such that at each point of S, X has at least one nontrivial space-
like tangent vector and at least one nontrivial timelike vector. Then
$$\det(\tilde g_{ij}) = \det\left(g\left(\frac{\partial X}{\partial\xi^i}, \frac{\partial X}{\partial\xi^j}\right)\right) < 0 \tag{7.52}$$
at all points of S.

Proof. Let $p$ be a point on S, and let $V(\alpha)$ be the vector in $T_pS$ defined by
$$V(\alpha) = \cos\alpha\,\frac{\partial X}{\partial\xi^1} + \sin\alpha\,\frac{\partial X}{\partial\xi^2}$$
for $\alpha \in [0, 2\pi]$. Then
$$\begin{aligned}
\|V(\alpha)\|^2 &= \cos^2\alpha\; g\!\left(\frac{\partial X}{\partial\xi^1}, \frac{\partial X}{\partial\xi^1}\right) + 2\sin\alpha\cos\alpha\; g\!\left(\frac{\partial X}{\partial\xi^1}, \frac{\partial X}{\partial\xi^2}\right) + \sin^2\alpha\; g\!\left(\frac{\partial X}{\partial\xi^2}, \frac{\partial X}{\partial\xi^2}\right)\\
&= \cos^2\alpha\,\tilde g_{11} + 2\sin\alpha\cos\alpha\,\tilde g_{12} + \sin^2\alpha\,\tilde g_{22}.
\end{aligned} \tag{7.53}$$
The property of tangent vectors of being timelike or spacelike is independent of the
length or sign of the vector. Thus, for some $\alpha_1$, there exists a vector $V(\alpha_1)$ such
that $\|V(\alpha_1)\|^2 < 0$, and for some $\alpha_2$ there exists a $V(\alpha_2)$ such that $\|V(\alpha_2)\|^2 > 0$.
Furthermore, $\|V(\alpha_i + \pi)\|^2 = \|V(\alpha_i)\|^2$. Hence, since $\|V(\alpha)\|^2$ changes sign twice
over $\alpha \in [0, \pi]$, it must have at least two distinct roots. Therefore, Equation (7.53)
leads to quadratic equations in $\tan\alpha$ or $\cot\alpha$. Either way, according to the quadratic
formula, the equation for $\tan\alpha$ or for $\cot\alpha$ has two distinct roots if and only if
$$\tilde g_{12}^2 - \tilde g_{11}\tilde g_{22} > 0.$$
The proposition follows immediately.

It is customary to parametrize S with two variables labeled $\sigma$ and $\tau$ defined
in such a way that for all points in the parameter space, $\partial X/\partial\sigma$ is a spacelike
tangent vector and $\partial X/\partial\tau$ is a timelike tangent vector. The parameters $\sigma$ and $\tau$ no
longer directly represent position along the string or time, respectively. One could
say that $\sigma$ and $\tau$ approximately represent position and time along the world sheet.
In fact, with the sole exception of the endpoints, when considering the motion of
open strings, one cannot know the movement of individual points on the string. In
general, $\sigma$ ranges over a finite interval $[0, \sigma_1]$, while $\tau$ ranges over all of $\mathbb{R}$.

The derivatives $\partial X/\partial\sigma$ and $\partial X/\partial\tau$ occur often enough that it is common to use
the symbols $X'$ and $\dot X$ for them, respectively. In components, we write
$$X'^\mu = \frac{\partial X^\mu}{\partial\sigma} \quad\text{and}\quad \dot X^\mu = \frac{\partial X^\mu}{\partial\tau}.$$


The area element of the world sheet in this Lorentzian space is defined as 


$$dA = \sqrt{-\det\tilde g}\;d\sigma\,d\tau = \sqrt{g(\dot X, X')^2 - g(\dot X, \dot X)\,g(X', X')}\;d\sigma\,d\tau. \tag{7.54}$$
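
The square root in the area element is real precisely because the induced metric has negative determinant (Proposition 7.4.1). The sketch below checks this numerically for one pair of tangent vectors in a three-dimensional Minkowski space with the time coordinate listed first. The sample vectors and helper names are illustrative assumptions, not taken from the text.

```python
import math


def minkowski(u, v):
    """Inner product with signature (-, +, +), time coordinate first."""
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def induced_metric(X1, X2):
    """2x2 matrix of inner products of the two tangent vectors X1 and X2."""
    return [
        [minkowski(X1, X1), minkowski(X1, X2)],
        [minkowski(X2, X1), minkowski(X2, X2)],
    ]


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def norm_squared_along(X1, X2, alpha):
    """Squared norm of V(alpha) = cos(alpha) X1 + sin(alpha) X2."""
    V = [math.cos(alpha) * a + math.sin(alpha) * b for a, b in zip(X1, X2)]
    return minkowski(V, V)


if __name__ == "__main__":
    X1 = (1.0, 0.5, 0.0)   # timelike: -1 + 0.25 < 0
    X2 = (0.2, 1.0, 0.3)   # spacelike: -0.04 + 1 + 0.09 > 0
    g = induced_metric(X1, X2)
    print("det of induced metric:", det2(g))
    print("area density:", math.sqrt(-det2(g)))
```

Because the squared norm of $V(\alpha)$ takes both signs as $\alpha$ varies, the determinant printed here is negative and the area density is a real number.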


We finish this section by briefly discussing the Nambu-Goto action for a free 
relativistic string and the resulting equations of motion for the string. 


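Definition 7.4.2. Let $g = \langle\,,\rangle$ be a pseudometric of signature (1,1). The Nambu-
Goto action of the string is defined as
$$\begin{aligned}
S &= -\frac{T_0}{c}\iint_S dA = -\frac{T_0}{c}\int_{\tau_1}^{\tau_2}\!\!\int_0^{\sigma_1} \sqrt{-\det\tilde g_{\alpha\beta}}\;d\sigma\,d\tau\\
  &= -\frac{T_0}{c}\int_{\tau_1}^{\tau_2}\!\!\int_0^{\sigma_1} \sqrt{\langle\dot X, X'\rangle^2 - \|\dot X\|^2\,\|X'\|^2}\;d\sigma\,d\tau,
\end{aligned} \tag{7.55}$$
where $T_0$ is called the string tension and $c$ is the speed of light.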


Before proceeding, we must give some justification for this definition. First of 
all, it mimics the action for a free relativistic particle given in (7.26). The difference 
is that instead of defining the action as a multiple of the length of the path in the 
ambient Minkowski space, we define it as a multiple of the area of the world-sheet. 
Furthermore, this action is obviously invariant under reparametrization since the 
area is a geometrical quantity. 

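As a more convincing argument, we consider a classical vibrating string of length
$\ell$. Using Figure 7.9 as a guide, we model the motion of the string by a function
$y(x, t)$ that measures the deviation of the string from rest at horizontal position
$x$ and at time $t$. If the string has constant density $\mu_0$ and tension $T_0$, then the
differential equation of motion for a string with small deviations is
$$\frac{\mu_0}{T_0}\frac{\partial^2 y}{\partial t^2} = \frac{\partial^2 y}{\partial x^2}.$$

Figure 7.9: A vibrating string.

The fraction $\mu_0/T_0$ has the units of time$^2$/length$^2$ and is in fact equal to $1/c^2$, where
$c$ is the speed of propagation of the wave. It is not hard to reason that at any point
in time $t$, the total kinetic energy of the string is
$$\int_0^\ell \frac{1}{2}\mu_0\left(\frac{\partial y}{\partial t}\right)^2 dx,$$
while the potential energy stored in the stretched string is $\int_0^\ell \frac{1}{2}T_0\left(\frac{\partial y}{\partial x}\right)^2 dx$.
Thus the Lagrangian of the system is
$$L = \int_0^\ell \left[\frac{1}{2}\mu_0\left(\frac{\partial y}{\partial t}\right)^2 - \frac{1}{2}T_0\left(\frac{\partial y}{\partial x}\right)^2\right] dx. \tag{7.56}$$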


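The integrand of Equation (7.56) is called the Lagrangian density and is denoted
by $\mathcal{L}$. It is explicitly a function of $\partial y/\partial t$ and $\partial y/\partial x$. The action of the system for
$t \in [t_1, t_2]$ is
$$S = \int_{t_1}^{t_2}\!\!\int_0^\ell \left[\frac{1}{2}\mu_0\left(\frac{\partial y}{\partial t}\right)^2 - \frac{1}{2}T_0\left(\frac{\partial y}{\partial x}\right)^2\right] dx\,dt. \tag{7.57}$$
Now assume that we are in a Minkowski space $\mathbb{R}^{1,2}$ with the flat pseudometric
$ds^2 = -(dx^0)^2 + (dx^1)^2 + (dx^2)^2$, where $x^0 = ct$, $x^1 = x$, and $x^2 = y$. After some
manipulation, we can rewrite that string action as
$$S = -\frac{T_0}{c}\iint \frac{1}{2}\left[-\left(\frac{\partial y}{\partial x^0}\right)^2 + \left(\frac{\partial y}{\partial x^1}\right)^2\right] dx^1\,dx^0. \tag{7.58}$$
The motion of the string can be parametrized in $\mathbb{R}^{1,2}$ by $f(x^0, x^1) = (x^0, x^1, y(x^0, x^1))$.
With the inner product induced by this metric, the area element becomes
$$\begin{aligned}
\sqrt{g\!\left(\frac{\partial f}{\partial x^0}, \frac{\partial f}{\partial x^1}\right)^2 - g\!\left(\frac{\partial f}{\partial x^0}, \frac{\partial f}{\partial x^0}\right) g\!\left(\frac{\partial f}{\partial x^1}, \frac{\partial f}{\partial x^1}\right)}
 &= \sqrt{1 - \left(\frac{\partial y}{\partial x^0}\right)^2 + \left(\frac{\partial y}{\partial x^1}\right)^2}\\
 &\approx 1 + \frac{1}{2}\left(-\left(\frac{\partial y}{\partial x^0}\right)^2 + \left(\frac{\partial y}{\partial x^1}\right)^2\right).
\end{aligned}$$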


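Adjusting for $x^0 = ct$, the Lagrangian associated to Equation (7.58) differs from
the linear approximation to the Nambu-Goto action
$$-T_0\int_0^\ell \left[1 + \frac{1}{2}\left(-\left(\frac{\partial y}{\partial x^0}\right)^2 + \left(\frac{\partial y}{\partial x^1}\right)^2\right)\right] dx$$
by
$$-T_0\int_0^\ell 1\,dx = -T_0\ell = -\mu_0\ell c^2 = -mc^2.$$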

Similar to the linear approximation to the Lagrangian for the free relativistic parti- 
cle in Equation (7.28), this difference is precisely the negative of the rest energy mc? 
of the string. Since this is a constant, it leaves the Euler-Lagrange equations un- 
changed. This shows how the classic Lagrangian of a wave is a linear approximation 
for the Lagrangian associated to the Nambu-Goto action. 
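
The quality of this linear approximation can be checked numerically. The sketch below compares the exact area density $\sqrt{1 - y_0^2 + y_1^2}$ with its first-order expansion, where $y_0$ and $y_1$ stand for the partial derivatives of $y$ with respect to $x^0$ and $x^1$. The sample slopes and function names are illustrative assumptions.

```python
import math


def area_density_exact(y0, y1):
    """Exact integrand sqrt(1 - y0^2 + y1^2) for slopes y0 and y1."""
    return math.sqrt(1.0 - y0 * y0 + y1 * y1)


def area_density_linear(y0, y1):
    """First-order expansion 1 + (1/2)(-y0^2 + y1^2) of the same integrand."""
    return 1.0 + 0.5 * (-y0 * y0 + y1 * y1)


if __name__ == "__main__":
    for scale in (0.1, 0.01, 0.001):
        y0, y1 = 1.0 * scale, 2.0 * scale
        gap = area_density_exact(y0, y1) - area_density_linear(y0, y1)
        print(scale, gap)
```

Each time the slopes shrink by a factor of 10, the gap shrinks by roughly a factor of $10^4$, consistent with an error that is quadratic in the small quantity $-y_0^2 + y_1^2$.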

We now wish to obtain the equations of motion associated to the Nambu-Goto
action. The Lagrangian density in Equation (7.55) is
$$\mathcal{L}(\dot X^\mu, X'^\mu) = -\frac{T_0}{c}\sqrt{\langle\dot X, X'\rangle^2 - \|\dot X\|^2\,\|X'\|^2}. \tag{7.59}$$
This is an explicit function of the eight variables $X'^\mu$ and $\dot X^\mu$ for $\mu = 0, 1, 2, 3$.
Hamilton’s principle states that the system will evolve in such a way as to minimize
the action. According to a generalization of the Euler-Lagrange Theorem in the
calculus of variations (see Problem 7.4.2), the Nambu-Goto action is minimized if
and only if the $X^\mu(\sigma, \tau)$ satisfy
$$\frac{\partial}{\partial\sigma}\left(\frac{\partial\mathcal{L}}{\partial X'^\mu}\right) + \frac{\partial}{\partial\tau}\left(\frac{\partial\mathcal{L}}{\partial\dot X^\mu}\right) = 0$$
for all $\mu$. These are the equations of motion for a relativistic string, whether open
or closed. More explicitly, the equations of motion read
$$\frac{\partial}{\partial\sigma}\left[\frac{\langle\dot X, X'\rangle\,g_{\mu\nu}\dot X^\nu - \|\dot X\|^2\,g_{\mu\nu}X'^\nu}{\sqrt{\langle\dot X, X'\rangle^2 - \|\dot X\|^2\,\|X'\|^2}}\right] + \frac{\partial}{\partial\tau}\left[\frac{\langle\dot X, X'\rangle\,g_{\mu\nu}X'^\nu - \|X'\|^2\,g_{\mu\nu}\dot X^\nu}{\sqrt{\langle\dot X, X'\rangle^2 - \|\dot X\|^2\,\|X'\|^2}}\right] = 0 \tag{7.60}$$
for $\mu = 0, 1, 2, 3$. At first glance, these equations are incredibly complicated. They
involve a system of four second-order partial differential equations of four functions
each in two variables. A remarkable fact among the basic results of string theory
is that it is possible to solve Equation (7.60) once one makes a suitable choice of $\sigma$
and $\tau$.

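Using the notion of generalized momenta defined in Equation (7.6), we define
two momenta densities $P^\sigma$ and $P^\tau$ that are cotangent vectors on the world sheet
with components
$$\begin{aligned}
P^\sigma_\mu &\overset{\text{def}}{=} -\frac{T_0}{c}\,\frac{\langle\dot X, X'\rangle\,g_{\mu\nu}\dot X^\nu - \|\dot X\|^2\,g_{\mu\nu}X'^\nu}{\sqrt{\langle\dot X, X'\rangle^2 - \|\dot X\|^2\,\|X'\|^2}},\\
P^\tau_\mu &\overset{\text{def}}{=} -\frac{T_0}{c}\,\frac{\langle\dot X, X'\rangle\,g_{\mu\nu}X'^\nu - \|X'\|^2\,g_{\mu\nu}\dot X^\nu}{\sqrt{\langle\dot X, X'\rangle^2 - \|\dot X\|^2\,\|X'\|^2}}.
\end{aligned} \tag{7.61}$$
(One should note that in this case the superscripts $\sigma$ and $\tau$ in $P^\sigma_\mu$ and $P^\tau_\mu$ are not
indices but are parameter indicators.) Then the equations of motion read
$$\frac{\partial P^\sigma_\mu}{\partial\sigma} + \frac{\partial P^\tau_\mu}{\partial\tau} = 0. \tag{7.62}$$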

This is all we will say about the underlying geometry in string theory. String 
theory extends well beyond the scope of this book, and we encourage the reader to 
consult [60] for an artful and accessible introduction to the subject. 


PROBLEMS 
7.4.1. Show that at some point on the world-sheet of a string, if the point moves at the 
speed of light, there is no timelike direction. 


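7.4.2. Use the methods of calculus of variations provided for the proof of Theorem B.3.1
to prove the following result. Let $x^1(s, t), \ldots, x^n(s, t)$ be $n$ twice-differentiable
functions in two variables. Denote derivatives by $x'^i = \partial x^i/\partial s$ and $\dot x^i = \partial x^i/\partial t$.
Suppose that a function $f$ is given explicitly in terms of $x^i$, $x'^i$, $\dot x^i$, $s$, and $t$. Show
that the integral
$$\int_{t_1}^{t_2}\!\!\int_{s_1}^{s_2} f(x^1, \ldots, x^n, x'^1, \ldots, x'^n, \dot x^1, \ldots, \dot x^n, s, t)\,ds\,dt$$
is optimized when
$$\frac{\partial f}{\partial x^i} - \frac{\partial}{\partial s}\left(\frac{\partial f}{\partial x'^i}\right) - \frac{\partial}{\partial t}\left(\frac{\partial f}{\partial\dot x^i}\right) = 0$$
for all $i = 1, \ldots, n$.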


7.4.3. Consider a free relativistic string with $\sigma$-length $\sigma_1$. The Hamiltonian for the system
is
$$H = \int_0^{\sigma_1} \left(P^\tau_\mu\dot X^\mu - \mathcal{L}\right) d\sigma.$$

(a) Recover the equations of motion as in Equation (7.62) from Hamilton’s equa-
tions of motion.

(b) Show that $H$ vanishes identically for all $\tau$.

(c) Let $\mathcal{L}$ be as in Equation (7.59). Consider the matrix with entries
$$\frac{\partial^2\mathcal{L}}{\partial\dot X^\mu\,\partial\dot X^\nu}.$$
Show that this matrix has two 0 eigenvalues, with eigenvectors $\dot X$ and $X'$.
Deduce the following conditions on the momentum $P^\tau$:
$$P^\tau_\mu X'^\mu = 0, \qquad g^{\mu\nu}P^\tau_\mu P^\tau_\nu + \frac{T_0^2}{c^2}\,g_{\mu\nu}X'^\mu X'^\nu = 0.$$
7.4.4. Show that according to the relativistic string equations of motion, the endpoints 
of an open string move with the speed of light. 


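7.4.5. Consider a relativistic string in Minkowski space $\mathbb{R}^{3,1}$ but only consider the history
of the string in the real space. We parametrize this history as $\vec X(\sigma, \tau)$. (We use the
vector superscript to indicate vectors in the Euclidean $\mathbb{R}^3$ part of the spacetime.)
Define $s(\sigma)$ to be the length of the string along $[0, \sigma]$, so that $s(0) = 0$ and $s(\sigma_1)$ is
the length of the string. Also set $t = \tau$.

(a) Prove that $\dfrac{\partial\vec X}{\partial s}$ is a unit vector.

(b) Define the vector $\vec v_\perp$ as the component of the velocity vector $\dfrac{\partial\vec X}{\partial t}$ that is
perpendicular to the string. Thus,
$$\vec v_\perp = \frac{\partial\vec X}{\partial t} - \left(\frac{\partial\vec X}{\partial t}\cdot\frac{\partial\vec X}{\partial s}\right)\frac{\partial\vec X}{\partial s},$$
where we use the usual dot product. Prove that one can write the Nambu-
Goto string action as
$$S = -T_0\int_{t_1}^{t_2}\!\!\int_0^{\sigma_1} \frac{ds}{d\sigma}\sqrt{1 - \frac{v_\perp^2}{c^2}}\;d\sigma\,dt.$$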


7.4.6. (*) The Nambu-Goto Bubble Action. Suppose that instead of considering particles 
as strings, we model them as bubbles. Then a world sheet S is given by a function 
X(o,@,7) into a pseudo-Riemannian manifold M with signature (2, 1). 


(a) Explain why it still makes sense to define the action of the free motion of 
the relativistic bubble for 7, < 7 < 72 by 


S= -2 fff / — det gag do da dr, 
S 


where $T_0$ is now a surface tension and $\tilde g$ is the metric induced from M on S.


(b) Write down the equations of motion associated to this action.


Figure 7.10: Stress tensor: an area element in a continuous medium. 


7.5 Brief Introduction to General Relativity 


As with the previous sections, the reader might consider it outlandish that we only 
allow one section to discuss general relativity. General relativity is a vast subject 
with contributions from a host of scientists and mathematicians, and it stands 
alongside quantum mechanics as one of the most revolutionary ideas in physics of 
the 20th century. 

On the other hand, most textbooks on general relativity take a considerable 
amount of time to develop the techniques of analysis on manifolds, in particu- 
lar, pseudo-Riemannian manifolds. However, these are precisely the mathematical 
methods we have developed in the previous chapters, so we are in a position to 
introduce some differential geometric concepts in general relativity as applications. 

From the perspective of mathematical structures, general relativity builds on 
special relativity. The postulates of special relativity brought us to the notion of 
spacetime, which is a Minkowski space $\mathbb{R}^{3,1}$. In general relativity, we will want
to consider our space as locally Minkowski, meaning that each tangent space is a 
Minkowski space. Hence, we model the universe as a pseudo-Riemannian manifold 
with signature (3,1). More importantly, the Einstein field equations propose a 
relationship between the presence of energy and the curvature of this spacetime 
manifold. 


7.5.1 Stress-Energy Tensor


In the mechanics of elastic media, one encounters the concept of a stress tensor, 
which is a tensor-valued function defined at each point within the body or medium. 
Suppose the body is in equilibrium but subject to external forces and/or body forces 
(i.e., forces that act through the whole body). Then there must exist internal forces. 
Let Q be a point, 7 a vector based at Q, and consider the area element AA that 
is in the plane perpendicular to 7 and has area equal to ||7|| (see Figure 7.10). Let 
$\Delta\vec F$ be the overall internal forces distributed over the area element $\Delta A$. The stress
vector through the area element $\Delta A$ is the vector


es _ AF 
It is not hard to show that the function T(7) is a linear function.[56, Section 10.6] 
Thus the stress tensor with respect to an orthogonal basis B based at the point Q 
is the matrix o such that 

T(r) =o [7] B 
Consider now a small rectangular parallelepiped with sides parallel to the coordinate
planes. The stress acts on each face as depicted in Figure 7.11. Then the columns
of $\sigma$ are given by

$$\sigma\vec{e}_i = T(\vec{e}_i) = \begin{pmatrix} \sigma_{1i} \\ \sigma_{2i} \\ \sigma_{3i} \end{pmatrix}.$$

Minimal assumptions that are logical for physics (angular momentum in a medium
cannot grow to be infinite at a point) imply that the stress tensor $\sigma$ is symmetric.

As a simple example, in an ideal fluid, the stress on any small area element 
is composed only of pressure, and there is no shearing force. Consequently, the 
stress tensor is $\sigma = PI$, where P is the pressure and I is the $3 \times 3$ identity matrix.
(This restates the claim given in calculus texts on the applications of integration 
to hydrostatics when one says that “at any point in a liquid the pressure is the 
same in all directions.” [55, p. 576]) The stress tensor arises also in the dynamics of 
viscous fluids where it is no longer necessarily diagonal. The stress tensor at a point 
“may be a function of the density and temperature, of the relative positions and 
velocities of elements near [the point], and perhaps also the previous history of the 
medium.” [56, p. 434] This characterization describes the stress tensor as a function 
of many ambient quantities, but the reference to “relative positions” indicates that 
the stress tensor need not be diagonal. 


Einstein’s equation in general relativity involves the so-called stress-energy ten- 
sor. This tensor is different from the stress tensor but is based on the same concept. 
We assume that we are in a Minkowski space with metric g of signature (3, 1). 

Figure 7.11: Action of stress on an infinitesimal coordinate cube.

As in special relativity, the four-velocity of a particle on a world line P parametrized
by $\vec{x}$ is the tangent vector along P given by

$$U = \frac{d\vec{x}}{d\tau},$$

where the proper time $\tau$ of an object along its world line is given in (7.20). From
the theory of special relativity, by (7.22), the four-velocity is

$$U = (U^0, U^1, U^2, U^3) = (\gamma c, \gamma v_x, \gamma v_y, \gamma v_z), \tag{7.65}$$

where $\gamma = (1 - v^2/c^2)^{-1/2}$ and $\vec{v} = (dx/dt, dy/dt, dz/dt)$. In (7.23), we defined the
momentum 4-vector of a particle of rest mass m by $p = mU$. Equally useful is the
four-momentum covector given by 


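$$\mathbf{p} = \langle p, \,\cdot\, \rangle_g, \quad \text{i.e.,} \quad p_\alpha = g_{\alpha\beta}\, p^\beta. \tag{7.66}$$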


In special relativity where the metric g = 7 is the standard Minkowski metric, the 
components of the momentum covector are 


$$(p_0, p_1, p_2, p_3) = \left(-\frac{E}{c},\; p_x,\; p_y,\; p_z\right), \tag{7.67}$$


where $(p_x, p_y, p_z) = m\gamma\,(v^1, v^2, v^3)$ is the relativistic 3-vector momentum.

Underlying the assumptions that define the stress-energy tensor, we assume that 
“spacetime contains a flowing river of 4-momentum” [41, p.130]. Any mass that is 
moving or anything with energy contributes to the 4-momentum. We could think of 
an individual particle, in which case the 4-momentum would only be defined on the 
particle’s world line, or we could consider a system of many particles carrying this 
4-momentum. In the latter case, we should think of the 4-momentum as a covector 
field on the spacetime manifold M, that is, as a 1-form. 

Let n be any 1-form on M. Then at each point $Q \in M$, $n_Q$ is perpendicular
(using $g = \langle\,,\rangle$ at Q) to a three-dimensional subspace of $T_QM$. This subspace can
be spanned by vectors $A_Q$, $B_Q$, and $C_Q$ such that

$$n_Q(u) = -\mathrm{Vol}_Q(u, A_Q, B_Q, C_Q), \tag{7.68}$$

where on the right-hand side we mean the 4-volume (with respect to g) of the
4-parallelepiped spanned by $u, A_Q, B_Q, C_Q \in T_QM$. Then the 3-volume of the
3-parallelepiped spanned by $A_Q, B_Q, C_Q$ is the length $\|n\|$. In this way, every 1-form represents
a volume element in $\mathbb{R}^4$.

We define the stress-energy tensor T of type (2,0) by T(p,n) being the flux of
the momentum covector p across a volume element represented by n. At each point
$Q \in M$, we have

$$T(p, n) = \langle p, n \rangle_g. \tag{7.69}$$

Over any coordinate chart $(x^\alpha)$, the components of T are

$$T^{\alpha\beta} = T(dx^\alpha, dx^\beta),$$

which represent the flux of unit four-momentum in direction $dx^\alpha$ across a volume
element of constant $x^\beta$.

The stress-energy tensor T is also called the energy-momentum tensor because 
it contains information pertaining to the momentum flowing through space and 
the presence of static or moving energy in space. The name “stress-energy tensor” 
is commonly used since it is modeled off the stress tensor in mechanics of elastic 
media. 

The following gives a summary of the information included in the stress-energy
tensor. Assuming $j, k > 0$,

$$\begin{aligned}
T^{00} &= \text{density of energy (including mass)}, && (7.70)\\
T^{j0} &= j\text{th component of the momentum density}, && (7.71)\\
T^{0k} &= k\text{th component of the energy flux}, && (7.72)\\
T^{jk} &= (j,k)\text{th component of the momentum stress} \\
       &= k\text{th component of the flux of the } j\text{th component of momentum}. && (7.73)
\end{aligned}$$

The notion of flux in this context refers to a similar limit as in Equation (7.63) but
in the situation where one is concerned with the movement of something (energy,
fluid momentum, heat, ...) through the infinitesimal area element $dA$. In fact,
with this particular concept of flux, one can define the stress-energy tensor in short
by saying that $T^{jk}$ is the kth component of the flux of the jth component of the
4-momentum.

We now state two facts about the stress-energy tensor that we do not fully justify 
here. 


Proposition 7.5.1. The (contravariant) stress-energy tensor T is symmetric. 


The symmetry in the components $1 \le i, j \le 3$ follows from the same physical
reasoning for why the stress tensor in fluid dynamics is symmetric.


Proposition 7.5.2 (Einstein’s Conservation Law). The conservation of energy is
equivalent to the identity

$$\operatorname{div} T = T^{\alpha\beta}{}_{;\beta} = 0.$$

Proof Sketch. Suppose that energy is conserved in a certain region of M. In other
words, though energy and mass may move around, no energy or mass is created or
annihilated in M. Then given any four-dimensional submanifold V with boundary
$\partial V$, the total flux of 4-momentum passing through $\partial V$ must be 0. We can restate
this as

$$\iiint_{\partial V} T \cdot dV^{(3)} = 0,$$

where $dV^{(3)}$ is the 3-volume element with direction along the outward-pointing
normal vector to $\partial V$. (We can view this as a volume 1-form.) By the product $\cdot$ we
mean the contraction of T with the volume 1-form element $dV^{(3)}$. Stokes’ Theorem
applied to pseudo-Riemannian manifolds gives

$$\iiiint_{V} \operatorname{div} T \; dV = \iiint_{\partial V} T \cdot dV^{(3)} = 0.$$

Since this is true for all V as described above, a limiting argument similar
to that used to show Gauss’ Law that $\operatorname{div} E = 0$ for an electric field in a charge-free region
establishes $\operatorname{div} T = 0$ everywhere.


Example 7.5.3 (Perfect Fluid Stress-Energy Tensor). A perfect fluid is a fluid in
which the pressure p is the same in any direction. The fluid must be free of heat
conduction and viscosity and any process that can cause internal shears. Using the
interpretation of T from Equations (7.70)–(7.73), we see that $T^{jk} = 0$ if $j \neq k$ and
$0 < j, k$. Furthermore, since the pressure is the same in all directions, $T^{jj} = p$ for
$j = 1, 2, 3$. For components involving $j = 0$ or $k = 0$, we first have $T^{00} = \rho$, the
energy density. This quantity includes mass density but also other types of energy
such as compression energy. For the remaining off-diagonal terms $T^{0j} = T^{j0}$, these
are 0 because of the assumption that there is no heat conduction in the perfect
fluid. Thus, the stress-energy tensor has components

$$T^{\alpha\beta} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix}. \tag{7.74}$$

If we suppose that an observer is in the Lorentz frame that is at rest with respect
to the movement of the fluid, then the velocity has components $u^\alpha = (1, 0, 0, 0)$.
With respect to the Minkowski metric $\eta$, we can write (7.74) as

$$T^{\alpha\beta} = (\rho + p)\,u^\alpha u^\beta + p\,\eta^{\alpha\beta}.$$

We can rewrite this in a coordinate-free way in any metric as

$$T = p\,g^{-1} + (\rho + p)\,u \otimes u, \tag{7.75}$$

where we have written $g^{-1}$ for the contravariant tensor of type (2,0) associated to
the metric tensor g.
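
As a quick numerical check of how (7.74) follows from the rest-frame formula, the following minimal Python sketch builds $T^{\alpha\beta} = (\rho + p)u^\alpha u^\beta + p\,\eta^{\alpha\beta}$ entry by entry. The function name `perfect_fluid_tensor`, the constant `ETA_INV`, and the sample values of `rho` and `p` are illustrative assumptions and do not come from the text.

```python
# Sketch: perfect-fluid stress-energy tensor in the fluid's rest frame.
# rho (energy density) and p (pressure) are arbitrary illustrative numbers.

def perfect_fluid_tensor(rho, p, u, eta_inv):
    """Return T^{ab} = (rho + p) u^a u^b + p eta^{ab} as a 4x4 nested list."""
    return [[(rho + p) * u[a] * u[b] + p * eta_inv[a][b] for b in range(4)]
            for a in range(4)]

# Inverse Minkowski metric with signature (3,1): diag(-1, 1, 1, 1).
ETA_INV = [[-1.0 if a == b == 0 else (1.0 if a == b else 0.0) for b in range(4)]
           for a in range(4)]

rho, p = 5.0, 2.0
u_rest = (1.0, 0.0, 0.0, 0.0)  # observer at rest with respect to the fluid

T = perfect_fluid_tensor(rho, p, u_rest, ETA_INV)
for row in T:
    print(row)
```

With these inputs the diagonal comes out as $(\rho, p, p, p)$ and every off-diagonal entry vanishes, because the $-p$ contributed by $\eta^{00}$ cancels the extra $p$ in $(\rho + p)u^0u^0$.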
Example 7.5.4 (Electromagnetic Stress-Energy Tensor). Directly using the interpretation
of T given in Equations (7.70)–(7.73) and results from electromagnetism,
which we do not recreate here, one can determine the components of the stress-energy
tensor for the electromagnetic field in free space. If $F^{\mu\nu}$ are the components
of the electromagnetic field tensor, then

$$\begin{aligned}
T^{\mu\nu} &= \frac{1}{\mu_0}\left(F^{\mu\alpha} g_{\alpha\beta} F^{\nu\beta} - \frac{1}{4}\, g^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta}\right) && \text{in SI units} && (7.76)\\
&= \frac{1}{4\pi}\left(F^{\mu\alpha} g_{\alpha\beta} F^{\nu\beta} - \frac{1}{4}\, g^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta}\right) && \text{in CGS units,} && (7.77)
\end{aligned}$$

where $\mu_0 = 4\pi \times 10^{-7}\ \mathrm{N\,A^{-2}}$ is a constant sometimes called the vacuum permeability.


As the context requires, we may need the stress-energy tensor to be of type (2, 0) 
as defined, of type (1,1) or of type (0,2). To pass between any of these, we raise or 
lower the indices as needed using the metric. 


7.5.2 Einstein Field Equations 


The Einstein field equations (EFE) are the heart of general relativity. They stem 
from the juxtaposition of the two following principles: 


1. Every aspect of gravity is a description of the spacetime geometry. 
2. Mass (energy) is the source of gravity. 


The metric tensor g encapsulates all the information about the geometry of
the spacetime. The metric g has the associated Levi-Civita connection $\nabla$, the
Riemann curvature tensor $\mathbf{R}$ of type (1,3), the Ricci curvature tensor Rc, and
the scalar curvature function R, defined in Chapter 6. (Note: In math texts on
Riemannian geometry, one often denotes by S the scalar curvature while texts on
general relativity invariably denote it by R. Using the bold font $\mathbf{R}$ to indicate the
curvature tensor avoids any confusion between the scalar and tensor curvature.)


On the other hand, the stress-energy tensor describes the spacetime content
of mass-energy. In fact, any observer with 4-velocity u measures the density of
mass-energy as

$$\rho = u \cdot T \cdot u = T_{\alpha\beta}\, u^\alpha u^\beta.$$


In order to put together the two above principles, we should be able to write
the tensor T exclusively in terms of the components of the metric tensor g. The
conservation of energy states that $\operatorname{div} T = 0$. Also, if T is to serve as a measure
of the curvature of spacetime, we propose that it should explicitly involve only
components of $\mathbf{R}$ and of g (no derivatives of any of these terms) and it should be
linear in the components of $\mathbf{R}$. It turns out that under these restrictions, there are
only a few options for a geometric description of T. In Problem 7.5.2, we show that
for purely mathematical reasons, these constraints impose that

$$T_{\alpha\beta} = C\left(R_{\alpha\beta} - \frac{1}{2} R\, g_{\alpha\beta} + \Lambda\, g_{\alpha\beta}\right), \tag{7.78}$$

where $R_{\alpha\beta}$ are the components of the Ricci curvature tensor, R is the scalar curvature,
and $\Lambda$ and C are real constants. This leads to Einstein’s field equations.

Let G be the Einstein curvature tensor described in Definition 6.5.4. General
relativity is summarized in the following equation. The presence of mass-energy
deforms spacetime according to

$$G + \Lambda g = \frac{8\pi G}{c^4}\, T \quad \text{in SI units}, \tag{7.79}$$

where, on the right-hand side, $G = 6.67 \times 10^{-11}\ \mathrm{m^3\,s^{-2}\,kg^{-1}}$ is the gravitational
constant, and $\Lambda$ is the cosmological constant. If we assume that empty (devoid of
energy) spacetime is flat, then $\Lambda = 0$ and

$$G = \frac{8\pi G}{c^4}\, T. \tag{7.80}$$

The equations in (7.80) are collectively called the Einstein field equations (EFE), and the
formulas in (7.79) are the Einstein field equations with cosmological constant. These
equations are as important in astrophysics as Newton’s second law of motion is in
classical mechanics. Though we do not give the calculation here, the constant $8\pi G/c^4$,
called Einstein’s constant, is chosen so that when (1) the gravitational field is weak
and (2) velocities are small compared to the speed of light, the theory reduces to the
Newtonian theory of gravitation in approximation.

In Equation (2) of Einstein’s original paper on general relativity [19], Einstein
made the assumption that G vanishes when spacetime is empty of mass-energy.
This corresponds to the mathematical assumption that $\Lambda = 0$. However, Equation
(7.80) predicts a dynamic universe. This result did not appeal to Einstein and,
at the time, there existed no astronomical evidence to support it. In 1917, he
introduced the constant $\Lambda$ because it allows for a static universe. Physically, $\Lambda \neq 0$
would imply the presence of an otherwise unexplained force that counteracts gravity
or a sort of negative pressure.

When Hubble discovered that the universe is expanding, the cosmological con- 
stant no longer appeared to be necessary and many physicists did away with it. In 
fact, in his autobiography, George Gamow relays that Einstein told Gamow that 
he considered the introduction of the cosmological constant as “the biggest blunder 
of my life.” [26] However, the possibility of a small nonzero $\Lambda$ has resurfaced and
regularly enters into the conversations around the current most vexing problems 
in physics, namely, the nature of dark energy and the effort to unify gravity and 
quantum mechanics. 

We should note that the Einstein field equations (EFE) are very complicated. Finding
a solution to the EFE means finding the metric tensor g that satisfies (7.80), which
consists of 10 second-order, nonlinear, partial differential equations in 10 functions
$g_{ij}(x^0, x^1, x^2, x^3)$, with $0 \le i \le j \le 3$. Then, determining the trajectory of a particle
or of radiation amounts to determining the geodesics in this metric. Surprisingly,
under some circumstances, especially scenarios that involve a high level of symmetry,
it is possible to provide an exact solution.

Whole books have been written about consequences of solutions to (7.80) or 
(7.79) that deviate from Newtonian mechanics. After the introduction of general 
relativity, some scientists balked at such mathematical complexity. Yet experiment 
has repeatedly confirmed predictions in favor of general relativity over Newtonian 
mechanics. The approach taken here to justify the Einstein field equations cited
mathematical esthetics, the condition of being divergence-free, and linearity in the
components of the curvature tensor $\mathbf{R}$. During the 20th century, scientists arrived
at (7.79) from other, more physical principles. Of note, in [36], Lovelock showed
that the Einstein field equations arise as the unique second-order equations that can
follow from the Euler-Lagrange equations of a Lagrange density involving $g_{ij}$ and its
derivatives up to second order. Furthermore, there exist natural generalizations to
Einstein’s theory of general relativity. However, any theory that can unify gravity 
and quantum mechanics should be able to derive (7.79) as an approximation. 


7.5.3 Schwarzschild Metric 


We finish this section with one of the earliest proven consequences of general rel- 
ativity, i.e., the Schwarzschild metric which is an exact solution to Einstein’s field 
equations. Instead of simply showing that the Schwarzschild metric satisfies EFE, 
we show how the metric was discovered. 

Astronomy is one of the main contexts in which one can expect to see the effects of
general relativity as distinct from Newtonian mechanics. The simplest
dynamical problem in astronomy involves calculating the orbit of a single planet
around the sun. One can hope that the EFE for the effect of the sun on the space
around it will become simple under the following two assumptions (approximations):


1. The sun is a spherically symmetric distribution of mass-energy density. 
2. Outside of the sun, the stress-energy tensor should vanish. 


The spherical symmetry implies that the components of the metric tensor should be
given as functions of $x^0$ and r alone, where $r^2 = (x^1)^2 + (x^2)^2 + (x^3)^2$. Since we are
looking only for solutions outside the sun, we are looking for solutions in a vacuum.
Thus T = 0, from which we deduce that G = 0. Thus, $\operatorname{Tr}_g G = 0$. However,

$$\operatorname{Tr}_g G = \operatorname{Tr}_g\left(\mathrm{Rc} - \frac{1}{2} R\, g\right) = R - \frac{1}{2} R \cdot 4 = -R,$$

where this follows from (6.56) and the fact that $\operatorname{Tr}_g g = \dim M = 4$. Thus, R = 0,
and the fact that G = 0 implies that we are looking for spherically symmetric
solutions to the equation

$$R_{\alpha\beta} = 0. \tag{7.81}$$

Since we are looking for solutions in a vacuum, it seems as though we have lost 
information, but, as we shall see, that is not the case. 

The following derivation follows the treatment in [54]. We leave some of the 
details as exercises for the reader. 

A judicious choice of coordinates and a few coordinate transformations will
simplify the problem. We first start with the coordinates

$$(\bar{x}^0, \bar{x}^1, \bar{x}^2, \bar{x}^3) = (\bar{x}^0, \bar{r}, \theta, \varphi),$$

where $\bar{x}^0 = c\bar{t}$ for some timelike variable $\bar{t}$, $\bar{r}^2 = (x^1)^2 + (x^2)^2 + (x^3)^2$, and where $\theta$
and $\varphi$ are given in the physics style of defining spherical coordinates, i.e., so that $\varphi$ is
the longitudinal angle and $\theta$ is the latitude angle measured down from a “positive”
vertical direction. We know that the standard line element in spherical coordinates
is

$$ds^2 = d\bar{r}^2 + \bar{r}^2\, d\theta^2 + \bar{r}^2 \sin^2\theta\, d\varphi^2.$$

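Though we are not working with the Euclidean metric, spherical symmetry does
imply that the metric tensor in the space coordinates is orthogonal and that no
perpendicular direction to the radial direction is singled out. Thus, the metric
tensor has the form

$$g = \begin{pmatrix}
g_{00}(\bar{x}^0, \bar{r}) & g_{01}(\bar{x}^0, \bar{r}) & g_{02}(\bar{x}^0, \bar{r}) & g_{03}(\bar{x}^0, \bar{r}) \\
g_{10}(\bar{x}^0, \bar{r}) & g_{11}(\bar{x}^0, \bar{r}) & 0 & 0 \\
g_{20}(\bar{x}^0, \bar{r}) & 0 & f(\bar{x}^0, \bar{r})^2 & 0 \\
g_{30}(\bar{x}^0, \bar{r}) & 0 & 0 & f(\bar{x}^0, \bar{r})^2 \sin^2\theta
\end{pmatrix}, \tag{7.82}$$

where f is any smooth function. We actually have some choice on $\theta$ and $\varphi$ because
they are usually given in reference to some preferred x-axis and z-axis. We choose
$\theta$ and $\varphi$ (which may change over time with respect to some fixed Cartesian frame)
so that $g_{20} = g_{30} = 0$, and then the metric looks like

$$g = \begin{pmatrix}
g_{00}(\bar{x}^0, \bar{r}) & g_{01}(\bar{x}^0, \bar{r}) & 0 & 0 \\
g_{10}(\bar{x}^0, \bar{r}) & g_{11}(\bar{x}^0, \bar{r}) & 0 & 0 \\
0 & 0 & f(\bar{x}^0, \bar{r})^2 & 0 \\
0 & 0 & 0 & f(\bar{x}^0, \bar{r})^2 \sin^2\theta
\end{pmatrix}. \tag{7.83}$$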

We make the coordinate transformation $r = f(\bar{x}^0, \bar{r})$ and all the other coordinates
remain the same. In this coordinate system, the metric looks like

$$g = \begin{pmatrix}
g_{00}(\bar{x}^0, r) & g_{01}(\bar{x}^0, r) & 0 & 0 \\
g_{10}(\bar{x}^0, r) & g_{11}(\bar{x}^0, r) & 0 & 0 \\
0 & 0 & r^2 & 0 \\
0 & 0 & 0 & r^2 \sin^2\theta
\end{pmatrix}. \tag{7.84}$$

Finally, we can orthogonalize the metric tensor by a suitable coordinate transformation
of $x^0 = ct = h(\bar{x}^0, r)$, with r staying fixed (see Problem 7.5.4). Since we
know that the metric has signature (3, 1), we can write the metric in the coordinate
system $(x^0, r, \theta, \varphi)$ as

$$g = \begin{pmatrix}
-e^{\nu(x^0, r)} & 0 & 0 & 0 \\
0 & e^{\lambda(x^0, r)} & 0 & 0 \\
0 & 0 & r^2 & 0 \\
0 & 0 & 0 & r^2 \sin^2\theta
\end{pmatrix}, \tag{7.85}$$

where $\lambda$ and $\nu$ are smooth functions. The metric in Equation (7.85) is an orthogonal
metric that is spherically symmetric in the space variables.

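Using the notation $\dot{u} = \partial u/\partial x^0$ and $u' = \partial u/\partial r$, we can show (see Problem 7.5.5)
that the independent nonzero Christoffel symbols for the Levi-Civita connection are

$$\begin{aligned}
\Gamma^0_{00} &= \tfrac{1}{2}\dot{\nu}, & \Gamma^0_{01} &= \tfrac{1}{2}\nu', & \Gamma^0_{11} &= \tfrac{1}{2}\dot{\lambda}\, e^{\lambda - \nu}, & \Gamma^1_{00} &= \tfrac{1}{2}\nu' e^{\nu - \lambda}, && (7.86)\\
\Gamma^1_{01} &= \tfrac{1}{2}\dot{\lambda}, & \Gamma^1_{11} &= \tfrac{1}{2}\lambda', & \Gamma^1_{22} &= -r e^{-\lambda}, & \Gamma^1_{33} &= -r \sin^2\theta\, e^{-\lambda}, && (7.87)\\
\Gamma^2_{12} &= \tfrac{1}{r}, & \Gamma^2_{33} &= -\sin\theta\cos\theta, & \Gamma^3_{13} &= \tfrac{1}{r}, & \Gamma^3_{23} &= \cot\theta. && (7.88)
\end{aligned}$$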
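
A few of these symbols can be spot-checked numerically. The Python sketch below is a rough check rather than a proof: it restricts to a static metric of the form (7.85), picks the illustrative choice $\nu = -\lambda = \log(1 - 2M/r)$ with $M = 1$ (any smooth functions would do), and applies the Levi-Civita formula $\Gamma^k_{ij} = \tfrac{1}{2} g^{kk}(\partial_i g_{kj} + \partial_j g_{ki} - \partial_k g_{ij})$ for a diagonal metric using central differences. The names `metric_diag`, `d_metric`, and `christoffel` are hypothetical helpers introduced only for this example.

```python
import math

M = 1.0  # illustrative constant in the assumed choice nu = -lambda = log(1 - 2M/r)

def nu(r):
    return math.log(1.0 - 2.0 * M / r)

def lam(r):
    return -nu(r)

def metric_diag(x):
    """Diagonal entries of the static metric (7.85) at x = (x0, r, theta, phi)."""
    _, r, theta, _ = x
    return [-math.exp(nu(r)), math.exp(lam(r)), r * r, (r * math.sin(theta)) ** 2]

def d_metric(x, i, k, h=1e-6):
    """Central-difference estimate of the derivative of g_kk with respect to x^i."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (metric_diag(xp)[k] - metric_diag(xm)[k]) / (2.0 * h)

def christoffel(x, k, i, j):
    """Gamma^k_{ij} for a diagonal metric, from the Levi-Civita formula."""
    total = 0.0
    if k == j:
        total += d_metric(x, i, k)   # d_i g_kj is nonzero only when j == k
    if k == i:
        total += d_metric(x, j, k)   # d_j g_ki is nonzero only when i == k
    if i == j:
        total -= d_metric(x, k, i)   # d_k g_ij is nonzero only when i == j
    return 0.5 * total / metric_diag(x)[k]

POINT = (0.0, 5.0, 1.0, 0.3)  # (x0, r, theta, phi), chosen arbitrarily with r > 2M
r, theta = POINT[1], POINT[2]

print(christoffel(POINT, 1, 2, 2), -r * math.exp(-lam(r)))              # Gamma^1_22
print(christoffel(POINT, 2, 3, 3), -math.sin(theta) * math.cos(theta))  # Gamma^2_33
print(christoffel(POINT, 3, 2, 3), 1.0 / math.tan(theta))               # Gamma^3_23
```

Each printed pair should agree to within the finite-difference error; the time-derivative entries in (7.86) and (7.87) are not exercised because the sketch assumes no $x^0$ dependence.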


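Though it is a little long to calculate (see Problem 7.5.6), we then determine that
the only nonzero components of the Ricci tensor are

$$\begin{aligned}
R_{00} &= e^{\nu - \lambda}\left(\frac{\nu''}{2} + \frac{(\nu')^2}{4} - \frac{\nu'\lambda'}{4} + \frac{\nu'}{r}\right) - \left(\frac{\ddot{\lambda}}{2} + \frac{\dot{\lambda}^2}{4} - \frac{\dot{\lambda}\dot{\nu}}{4}\right), \\
R_{01} &= R_{10} = \frac{\dot{\lambda}}{r}, \\
R_{11} &= -\left(\frac{\nu''}{2} + \frac{(\nu')^2}{4} - \frac{\nu'\lambda'}{4} - \frac{\lambda'}{r}\right) + e^{\lambda - \nu}\left(\frac{\ddot{\lambda}}{2} + \frac{\dot{\lambda}^2}{4} - \frac{\dot{\lambda}\dot{\nu}}{4}\right), \\
R_{22} &= -e^{-\lambda}\left(1 + \frac{r}{2}(\nu' - \lambda')\right) + 1, \\
R_{33} &= \sin^2\theta\, R_{22}.
\end{aligned} \tag{7.89}$$

Since we are trying to solve $R_{\alpha\beta} = 0$, we obtain conditions on the functions $\lambda$
and $\nu$. Since $R_{01} = 0$, we deduce immediately that $\dot{\lambda} = 0$, which means that $\lambda$ is a
function of r alone. Also, since $\partial R_{22}/\partial t = 0$, we find that $\partial \nu'/\partial t = 0$. Therefore,
we can write the function $\nu$ as

$$\nu = \nu(r) + f(t)$$

for some function f(t).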
We now make one final coordinate change. In the metric line element, t appears
only in the summand $e^{\nu} d(x^0)^2 = e^{\nu(r)} e^{f(t)}\, d(ct)^2$. So by choosing the variable $\bar{t}$ in
such a way that

$$\frac{d\bar{t}}{dt} = e^{f(t)/2}$$

and then renaming $\bar{t}$ to just t, we obtain a metric which is independent of any
timelike variable. (The variable $\bar{t}$, relabeled as t, is not necessarily time anymore, so
we cannot necessarily call a solution that is independent of t a static solution.)

We can now assume there is no t dependence. Simplifying the expression
$R_{00} + e^{\nu - \lambda} R_{11}$ leads to

$$\frac{1}{r}(\lambda' + \nu') = 0.$$

This implies that $\lambda(r) = -\nu(r) + C$ for some constant C. Without loss of generality,
we can assume that C = 0 since we have not specified $\lambda$ or $\nu$. Thus, we set
$\lambda(r) = -\nu(r)$. Then $R_{22} = 0$ in (7.89) implies that

$$e^{-\lambda}(1 - r\lambda') = 1.$$

Now setting $h(r) = e^{-\lambda(r)}$, this last equation becomes

$$h' + \frac{1}{r} h = \frac{1}{r}.$$

This is a linear, first-order, ordinary differential equation whose general solution is

$$h(r) = 1 - \frac{2M}{r},$$

where M is a constant of integration. One can verify directly that $R_{11} = R_{00} = 0$
in Equation (7.89) are satisfied by this solution and therefore give no additional
conditions. Therefore, the spherically symmetric vacuum solution to the EFE gives
a metric with line element

$$ds^2 = -\left(1 - \frac{2M}{r}\right) c^2\, dt^2 + \left(1 - \frac{2M}{r}\right)^{-1} dr^2 + r^2\, d\theta^2 + r^2 \sin^2\theta\, d\varphi^2. \tag{7.90}$$
This is called the Schwarzschild metric. This metric provided the first exact solution
to the Einstein field equations. Though it is still complicated, this metric can be
compared in fundamental importance to the solution in mechanics to the differential
equation $m\, d^2z/dt^2 = -mg$. Many of the verifiable predictions of general relativity
arise from this metric.
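
The claim that $h(r) = 1 - 2M/r$ solves $h' + h/r = 1/r$ can be sanity-checked numerically. The sketch below is only an illustration: the function names `h` and `residual`, the step size, and the sample values of `r` and `M` are assumptions chosen for the example, and the derivative is approximated by a central difference.

```python
def h(r, M):
    """Candidate solution h(r) = 1 - 2M/r of the ODE h' + h/r = 1/r."""
    return 1.0 - 2.0 * M / r

def residual(r, M, step=1e-5):
    """Left side minus right side of h' + h/r = 1/r, with h' by central difference."""
    dh = (h(r + step, M) - h(r - step, M)) / (2.0 * step)
    return dh + h(r, M) / r - 1.0 / r

# Sample a few radii and a few values of the integration constant M.
for M in (0.0, 0.5, 2.0):
    for r in (3.0, 10.0, 250.0):
        print(M, r, residual(r, M))
```

Every residual should be zero up to rounding error, for any value of M, which reflects the fact that M enters only as the constant of integration: since $(rh)' = 1$, one has $rh = r - 2M$.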

In order to understand (7.90), we need to have some sense of the meaning of the
constant M. Obviously, if M = 0, then the Schwarzschild metric is simply the flat
Minkowski metric for spacetime.

To derive an interpretation for $M \neq 0$, we study some consequences of (7.90)
for small velocities. If the velocity v is much smaller than the speed of light, i.e.,
$v \ll c$, then special relativity tells us that proper time is approximately coordinate
time, $\tau \approx t = x^0/c$. Furthermore, from (7.65) we can approximate the velocity of
any particle as $U = (c, 0, 0, 0)$. Plugging these into the geodesic equation

$$\frac{d^2x^i}{d\tau^2} = -\Gamma^i_{\alpha\beta}\, \frac{dx^\alpha}{d\tau}\, \frac{dx^\beta}{d\tau},$$

we obtain the approximate relationship

$$\frac{d^2x^i}{dt^2} = -\Gamma^i_{00}\, c^2 = \frac{1}{2}\, c^2 g^{il}\, \frac{\partial g_{00}}{\partial x^l}, \tag{7.91}$$

where the second equality follows from the formula for Christoffel symbols and the
fact that the functions $g_{ij}$ are not $x^0$ dependent. However, since g is diagonal and
$g_{00}$ depends only on r, we find that the only nonzero derivative is

$$\frac{d^2r}{dt^2} = -\frac{c^2}{2}\left(1 - \frac{2M}{r}\right)\frac{\partial}{\partial r}\left(1 - \frac{2M}{r}\right) = -\frac{Mc^2}{r^2}\left(1 - \frac{2M}{r}\right).$$

We must compare this to the formula for gravitational attraction in Newtonian
mechanics, namely,

$$\frac{d^2r}{dt^2} = -\frac{GM_S}{r^2},$$

where $M_S$ is the mass of the attracting body (and G is the gravitational constant).
Thus, we find as a first approximation that the constant of integration M is

$$M = \frac{GM_S}{c^2}. \tag{7.92}$$

Hence, M is a constant multiple of the mass of the attracting body. The constant
2M has the dimensions of length, and one calls $r_G = 2M$ the Schwarzschild radius.
The formula for it is

$$r_G = \frac{2G}{c^2}\, M_S = (1.48 \times 10^{-27}\ \mathrm{m/kg})\, M_S.$$
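
To see how good the first approximation behind (7.92) is, one can compare the radial acceleration derived from the Schwarzschild metric with the Newtonian value at a typical planetary distance. The following Python sketch is illustrative only; the numerical constants (G, c, the solar mass, and the Earth-sun distance) are standard reference values that do not appear in the text, and the function names are hypothetical.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (reference value)
C = 2.998e8          # speed of light, m/s (reference value)
SOLAR_MASS = 1.989e30    # kg (reference value, not from the text)
EARTH_ORBIT = 1.496e11   # mean Earth-sun distance in m (reference value)

def geometric_mass(mass_kg):
    """The constant M of (7.92), in meters: M = G * M_S / c^2."""
    return G * mass_kg / C**2

def radial_acceleration_gr(r, mass_kg):
    """Low-velocity radial acceleration -(M c^2 / r^2)(1 - 2M/r)."""
    M = geometric_mass(mass_kg)
    return -(M * C**2 / r**2) * (1.0 - 2.0 * M / r)

def radial_acceleration_newton(r, mass_kg):
    """Newtonian acceleration -G * M_S / r^2."""
    return -G * mass_kg / r**2

ratio = radial_acceleration_gr(EARTH_ORBIT, SOLAR_MASS) / radial_acceleration_newton(EARTH_ORBIT, SOLAR_MASS)
print(ratio)  # equals 1 - 2M/r, which differs from 1 by roughly 2e-8 here
```

The two accelerations differ only by the factor $1 - 2M/r$, so at the Earth's orbit the relative discrepancy is about two parts in $10^8$, which is why the identification $M = GM_S/c^2$ is a sound first approximation.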


The Schwarzschild radius is 2.95 km for the sun and 8.8 mm for the Earth. Evidently,
for spherically symmetric objects that one encounters in common experience,
the Schwarzschild radius is much smaller than the object’s actual radius. In fact, if
a spherically symmetric object has radius $R_S$ and mass $M_S$, then

$$r_G < R_S \iff \frac{2GM_S}{c^2} < R_S.$$

A sphere with $r_G > R_S$ would need to have an enormous density. Furthermore,
this situation would seem to be physically impossible for the following reason. It is
understood that the Schwarzschild metric holds only in the vacuum outside of the
body (planet or star). However, if $r_G > R_S$, then the Schwarzschild radius would
correspond to a sphere outside of the spherical body where the Schwarzschild metric
has a singularity $g_{11} = 1/0$. For this reason, some physicists initially claimed this
to be a result of the successive approximations or simply a physically impossible
situation.
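
The quoted Schwarzschild radii are easy to reproduce. The short Python sketch below evaluates $r_G = 2GM_S/c^2$ for the sun and the Earth and compares each with the body's actual radius; the masses and radii are standard reference values supplied here as assumptions (they are not given in the text), and `schwarzschild_radius` is a hypothetical helper name.

```python
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2 (reference value)
C = 2.998e8     # speed of light, m/s (reference value)

def schwarzschild_radius(mass_kg):
    """Return r_G = 2 G M_S / c^2 in meters."""
    return 2.0 * G * mass_kg / C**2

# (mass in kg, actual radius in m): reference values, not taken from the text.
BODIES = {
    "sun": (1.989e30, 6.957e8),
    "earth": (5.972e24, 6.371e6),
}

for name, (mass, radius) in BODIES.items():
    r_g = schwarzschild_radius(mass)
    print(name, r_g, r_g < radius)
```

With these inputs the sun comes out near 2.95 km and the Earth just under 9 mm, and in both cases $r_G < R_S$ by many orders of magnitude.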

The history of science has occasionally shown that singularities in the equations
do not immediately imply that the scenario is impossible. Traveling
at the speed of sound was once thought to be impossible because of the consequences
for the Doppler effect equation. Now, military jets regularly fly faster than the speed
of sound. Similarly, in recent decades, physicists regularly study objects considered
to be so dense that $2GM_S/c^2 > R_S$. Such objects are called black holes. For a
time, the existence of black holes remained in the realm of hypothesis, but now astronomers
are convinced they have observed many such objects, and astrophysicists
have worked out many of their dynamic properties.


PROBLEMS 
7.5.1. Show that we can rephrase the explanation for Equation (7.68) by saying that for
any three vectors A, B, and C in $T_PM$,

$$(\star n_P)(A, B, C) = \mathrm{Vol}_g(n_P^\sharp, A, B, C),$$

where $\mathrm{Vol}_g$ is the volume form with respect to the metric g.


7.5.2. Let L be a symmetric tensor of type (0,2) consisting of components that are constructible
from those of $\mathbf{R}$ and g and are linear in the components of the Riemann
curvature tensor $\mathbf{R}$.

(a) Show that L can only have the form

$$L_{\alpha\beta} = a R_{\alpha\beta} + b R\, g_{\alpha\beta} + \lambda\, g_{\alpha\beta},$$

where $R_{\alpha\beta}$ are the components of the Ricci curvature tensor, R is the scalar
curvature, and a, b, and $\lambda$ are real constants. [Hint: Consider Bianchi
identities.]

(b) Show that $\operatorname{div} L = 0$ if and only if $b = -\frac{1}{2}a$.

(c) If $g = \eta$, the standard Minkowski metric, show that L = 0 if and only if
$\lambda = 0$.

7.5.3. Calculate the curvature tensor and the Ricci curvature tensor for the Schwarzschild 
metric. 


7.5.4. Find the “suitable” coordinate transformation h that allows one to pass from Equa- 
tion (7.84) to Equation (7.85). 


7.5.5. Prove that Equation (7.87) is correct. 
7.5.6. Prove Equation (7.89). 


7.5.7. Light Propagation in the Schwarzschild Metric. In the Schwarzschild metric, light
travels along the null-geodesics, i.e., where $ds^2 = 0$.


(a) Explain why setting $\theta = 0$ does not lose any generality in finding the
null-geodesics.

Figure 7.12: Deviation of light near a massive body.

(b) Prove that if one sets $u = 1/r$, then $ds^2 = 0$ implies that

$$\frac{d^2u}{d\varphi^2} + u = 3Mu^2. \tag{7.93}$$

Deduce that in the vicinity of a black hole, light travels in a circle precisely
at the radius $r = \frac{3}{2} r_G$.

(c) Solve Equation (7.93) for M = 0. This corresponds to empty space (no mass
present). Call this solution $u_0(\varphi)$.

(d) Now look for general solutions u to Equation (7.93) by setting $u = u_1 + u_0$.
Then $u_1(\varphi)$ must satisfy

$$\frac{d^2u_1}{d\varphi^2} + u_1 = \frac{3M}{R_0^2}\, \sin^2(\varphi - \varphi_0).$$

Solve this differential equation explicitly, and find the complete solution to
Equation (7.93). In the complete solution, show that M = 0 (empty space) corresponds to
traveling along a straight line $u = \frac{1}{R_0} \sin(\varphi - \varphi_0)$, where $R_0$ is the distance
from the line to the origin. Show that the general solution to Equation (7.93) is
asymptotically a line.

(e) We now consider Eddington’s famous experiment to measure the deviation
of light by the sun. Consider a geodesic $\mathcal{G}$ in the Schwarzschild metric
that passes right alongside the sun, i.e., passes through the point $r = R_S$
and $\varphi = 0$. Define $\varphi_\infty$ as the limiting angle of deviation between the line
$r = R_S/\sin\varphi$ and the geodesic $\mathcal{G}$ (see Figure 7.12). The sun bends the light
away from the straight line by a total of $2\varphi_\infty$. Using (at a judicious point)
the approximation that $\sin\varphi \approx \varphi$, prove that the total deviation of light is

$$2\varphi_\infty = \frac{4GM_S}{R_S c^2},$$


where $M_S$ is the mass of the sun and $R_S$ is the sun’s radius.

7.5.8. Consider the metric with line element $ds^2 = -e^{2ax}\, dt^2 + dx^2 + dy^2 + dz^2$.

(a) From the geodesic equation associated to this metric, show that at every
point in this spacetime, a free particle experiences an acceleration of
$d^2x/d\tau^2 = -a$.

(b) Show that the only nonzero Christoffel symbols for this metric are
$\Gamma^0_{01} = \Gamma^0_{10} = a$ and $\Gamma^1_{00} = a e^{2ax}$.

(c) Show that the only nonzero components of the (0,4) curvature tensor are
$R_{0101}$ and its permutations. Calculate $R_{0101}$.

(d) Find the Ricci curvature tensor and notice that it is diagonal.


APPENDIX A 


Point Set Topology 


Though mathematicians, when developing a new area of mathematics, may define 
and study any object as they choose, the “natural” notion of a surface in $\mathbb{R}^3$ requires
a rather intricate definition (Definition 3.1.1). Though at first somewhat unwieldy, 
this definition and also the definition for a differentiable manifold are necessary to 
appropriately generalize calculus and geometry to non-Euclidean spaces. 

On the other hand, numerous concepts from geometry and calculus can be gen- 
eralized not by formulating more constrained definitions but by expanding the con- 
text in which we define these concepts. The first wider context presented in this 
appendix is that of a metric space, a set equipped with some notion of distance. 
Many concepts from Euclidean geometry, including continuity, have natural gen- 
eralizations to metric spaces. As it turns out, many useful concepts for analysis, 
like continuity, limit of sequences, or connectedness, arise in the yet more general 
context of topological spaces, where instead of a distance function, we have a looser 
notion of “nearness.” 

Though topology is a vast branch of mathematics, this appendix presents just 
the basic notions that support this book’s presentation of differential geometry. A 
reader might encounter many of these concepts in a typical analysis course. We 
refer the reader to [27] for a gentle but thorough introduction to point set topology 
and to [43] and [2] for an introduction to topology that includes homology, the 
fundamental group, algebraic topology, and the classification of surfaces. 


A.1 Metric Spaces 
A.1.1 Metric Spaces: Definition 


A metric space is a set that comes with the notion of “distance” between two 
points, where this distance function involves a few numerical conditions that mimic 
geometry in Euclidean spaces. 


Definition A.1.1. Let X be any set. A metric on X is a function $D : X \times X \to \mathbb{R}^{\ge 0}$
such that

1. equality: $D(x, y) = 0$ if and only if $x = y$;

2. symmetry: $D(x, y) = D(y, x)$ for all $x, y \in X$;

3. triangle inequality: $D(x, y) + D(y, z) \ge D(x, z)$ for all $x, y, z \in X$.

A pair (X, D) where X is a set with a metric D is called a metric space.

Example A.1.2 (Euclidean Spaces). The Euclidean space $\mathbb{R}^n$ is a metric space
where D is the usual Euclidean distance formula between two points, namely, if
$P = (p_1, p_2, \ldots, p_n)$ and $Q = (q_1, q_2, \ldots, q_n)$, then

$$D(P, Q) = \sqrt{\sum_{i=1}^n (q_i - p_i)^2}.$$

Many notions in usual geometry (circles, parallelism, midpoint, ...) depend vitally
on this particular distance formula. Furthermore, if n = 1, this formula simplifies
to the usual distance formula on the real line $\mathbb{R}$, namely,

$$D(x, y) = |y - x|.$$


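To prove that $(\mathbb{R}^n, D)$ is indeed a metric space, we must verify the three axioms
in Definition A.1.1. The first holds because

$$D(P, Q) = 0 \iff \sum_{i=1}^n (q_i - p_i)^2 = 0,$$

which is equivalent to $(q_i - p_i)^2 = 0$ for all $1 \le i \le n$, and hence $q_i = p_i$ for all
$1 \le i \le n$. The second obviously holds, and we prove the third axiom as follows.
The Cauchy-Schwarz inequality on the vectors $\overrightarrow{PQ}$ and $\overrightarrow{QR}$ gives

$$\overrightarrow{PQ} \cdot \overrightarrow{QR} \le |\overrightarrow{PQ} \cdot \overrightarrow{QR}| \le \|\overrightarrow{PQ}\|\, \|\overrightarrow{QR}\|,$$

so

$$2 \sum_{i=1}^n (q_i - p_i)(r_i - q_i) \le 2 \sqrt{\sum_{i=1}^n (q_i - p_i)^2}\, \sqrt{\sum_{i=1}^n (r_i - q_i)^2}.$$

Using the property that $2ab = (a + b)^2 - a^2 - b^2$ with $a = q_i - p_i$ and $b = r_i - q_i$,
we get

$$\sum_{i=1}^n (r_i - p_i)^2 \le \left(\sqrt{\sum_{i=1}^n (q_i - p_i)^2} + \sqrt{\sum_{i=1}^n (r_i - q_i)^2}\right)^2,$$

from which it follows that $D(P, Q) + D(Q, R) \ge D(P, R)$.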

Of course, the triangle inequality is used in Definition A.1.1 precisely because it 
is one of the fundamental properties of the Euclidean distance function. However, we 
needed to verify the triangle inequality based on the formula given for the Euclidean 
metric, and the example illustrates what is required in order to establish the three 
axioms. 


Example A.1.3. There exists a variety of other metrics on Euclidean space, and we
illustrate a few of these alternate metrics for $\mathbb{R}^2$. Let $P = (x_1, y_1)$ and $Q = (x_2, y_2)$.
We leave to the reader the proofs that the following functions are metrics on $\mathbb{R}^2$:

$$\begin{aligned}
D_1(P, Q) &= |x_2 - x_1| + |y_2 - y_1|, \\
D_3(P, Q) &= \sqrt[3]{|x_2 - x_1|^3 + |y_2 - y_1|^3}, \\
D_\infty(P, Q) &= \max\{|x_2 - x_1|, |y_2 - y_1|\}.
\end{aligned}$$
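
Before attempting the proofs, it can help to test the three axioms of Definition A.1.1 on a finite sample of points. The Python sketch below does this for $D_1$, $D_3$, and $D_\infty$; it is a numerical spot check under stated assumptions (a random sample of 12 points and a small floating-point tolerance), not a proof. The helper `satisfies_axioms` and the counterexample `squared_euclidean` are names introduced here for illustration.

```python
import itertools
import random

def d1(p, q):
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

def d3(p, q):
    return (abs(q[0] - p[0]) ** 3 + abs(q[1] - p[1]) ** 3) ** (1.0 / 3.0)

def dinf(p, q):
    return max(abs(q[0] - p[0]), abs(q[1] - p[1]))

def squared_euclidean(p, q):
    """Not a metric: included to show the check can fail (triangle inequality)."""
    return (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2

def satisfies_axioms(dist, points, tol=1e-12):
    """Check equality, symmetry, and the triangle inequality on a finite sample."""
    for p, q in itertools.product(points, repeat=2):
        if (dist(p, q) == 0) != (p == q):
            return False
        if abs(dist(p, q) - dist(q, p)) > tol:
            return False
    for p, q, r in itertools.product(points, repeat=3):
        if dist(p, q) + dist(q, r) < dist(p, r) - tol:
            return False
    return True

random.seed(0)
SAMPLE = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(12)]

for name, dist in (("D1", d1), ("D3", d3), ("Dinf", dinf)):
    print(name, satisfies_axioms(dist, SAMPLE))
print("squared", satisfies_axioms(squared_euclidean, [(0, 0), (1, 0), (2, 0)]))
```

The squared Euclidean distance fails on the collinear points $(0,0)$, $(1,0)$, $(2,0)$ because $1 + 1 < 4$, which illustrates that the triangle inequality is a genuine restriction and not automatic.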


Example A.1.4 (Six Degrees of Kevin Bacon). A humorous example of a metric
space is the set of syndicated actors $A$ equipped with the function $D$ defined as
follows. Consider the graph whose set of vertices is $A$ and has an edge between
two actors $a_1$ and $a_2$ if they acted in a movie together. Define $D(a_1, a_2)$ as 0 if
$a_1 = a_2$ and, otherwise, as the minimum number of edges it takes to create a path
connecting $a_1$ and $a_2$. The pair $(A, D)$ is a metric space.

The party game called "Six Degrees of Kevin Bacon" asks players to find $D(a_1, a_2)$
given any pair $(a_1, a_2)$.
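
The path-length distance of Example A.1.4 can be computed by breadth-first search. The sketch below is illustrative only: the co-starring data is hypothetical, and the function returns `None` when no path exists, a case in which the minimum in the definition is taken over an empty set.

```python
from collections import deque

def graph_distance(edges, start, goal):
    """Minimum number of edges on a path from start to goal, or None if no path exists."""
    if start == goal:
        return 0
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

# Hypothetical data: each pair of labels stands for two actors who share a movie.
costars = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
print(graph_distance(costars, "A", "D"))  # 3, along the path A - B - C - D
```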


Having a notion of distance in a set, we may want to consider the subset of all 
points that are within a certain distance of a fixed point. 


Definition A.1.5. Let $(X, D)$ be a metric space, and let $p \in X$ be a point. We
define the open ball of radius $r$ around $p$ as the set

$$B_r(p) = \{y \in X \mid D(p, y) < r\}.$$

The reader who is new to topology should note that the terminology "open ball"
might be initially misleading since the set $B_r(p)$ only takes the shape of an actual
ball (disk, sphere, etc.) in the case of the Euclidean metric on $\mathbb{R}^n$.


Example A.1.6. Consider the metric $D_1$ from Example A.1.3 above, and let $O = (0, 0)$.
The ball of radius 1 around the origin $O$ using the metric $D_1$ is the set

$$B_1(O) = \{(x, y) \in \mathbb{R}^2 \mid |x| + |y| < 1\}.$$

Notice that the equation $|x| + |y| = 1$ has a locus that is symmetric about the x-axis
and about the y-axis. So to determine its locus we only need to see what happens
in the first quadrant. In the first quadrant, the equation $|x| + |y| = 1$ becomes
$x + y = 1$, which is a line segment from $(1, 0)$ to $(0, 1)$. Thus, the open ball $B_1(O)$
is the open square with corners at $(1, 0)$, $(0, 1)$, $(-1, 0)$, and $(0, -1)$.


Figure A.1: Example A.1.7: a collar around $f(x)$.


Example A.1.7. Metric spaces can encompass a much wider range than the above
examples have illustrated so far. Let $X = C^0([a, b])$ be the set of continuous real
functions defined on the closed interval $[a, b]$, or let $X = F_{\text{bounded}}([a, b])$ be the set
of all bounded functions on the interval $[a, b]$. (A theorem of calculus tells us that any
function $f$ continuous over $[a, b]$ is bounded, so $C^0([a, b]) \subset F_{\text{bounded}}([a, b])$.) Define
the function $D : X \times X \to \mathbb{R}^{\geq 0}$ as

$$D(f, g) = \operatorname{lub}\{|g(x) - f(x)| : x \in [a, b]\},$$

where lub refers to the least upper bound of a subset of reals.
The open ball of radius $r$ around a function $f$ is the set of all the functions
$g \in X$ such that $|f(x) - g(x)| < r$ for all $x \in [a, b]$, or in other words,

$$f(x) - r < g(x) < f(x) + r \quad \text{for all } x \in [a, b].$$

In this context, we call the region $f(x) - r < y < f(x) + r$ with $a \leq x \leq b$ the
$r$-collar of $f(x)$. See Figure A.1.
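
The metric of Example A.1.7 can be approximated numerically. The following Python sketch replaces the least upper bound over $[a, b]$ by a maximum over a uniform grid, which is an assumption: a grid can miss the true supremum, so the value returned is only a lower estimate. The names `sup_distance` and `in_collar` are hypothetical.

```python
import math

def sup_distance(f, g, a, b, samples=10_001):
    """Approximate D(f, g) = lub{|g(x) - f(x)| : x in [a, b]} on a uniform grid."""
    step = (b - a) / (samples - 1)
    return max(abs(g(a + i * step) - f(a + i * step)) for i in range(samples))

def in_collar(f, g, r, a, b, samples=10_001):
    """Check, on the grid only, whether g stays inside the r-collar of f."""
    return sup_distance(f, g, a, b, samples) < r

# Distance between sin and cos on [0, pi]; the exact value is sqrt(2), attained at 3*pi/4.
d = sup_distance(math.sin, math.cos, 0.0, math.pi)
print(round(d, 6))
```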


Above, we introduced the notion of an open ball in any metric space. Though 
this appendix introduces notions that primarily support an overview of point-set 
topology, the notion of distance between two points allows us to generalize concepts 
from geometry to any metric space (X,D). We list here below a few of these 
concepts, which exist in any metric space but do not generalize further to topological 
spaces. 


• A point $C \in X$ is said to be between two points $A$ and $B$ if
$D(A, B) = D(A, C) + D(C, B)$. When $D$ is the Euclidean metric on $\mathbb{R}^n$, this equality
occurs for a degenerate triangle and only in the case when $C$ lies on the
segment $AB$. It is in this sense that this definition directly generalizes the
notion of betweenness from Euclidean geometry.

• The bisector of two points $A$ and $B$ is the set of points

$$\{P \in X \mid D(P, A) = D(P, B)\}.$$

This is the usual definition for the segment bisector in Euclidean geometry
but in other metric spaces this set may look quite different.

• If $A, B \in X$ and $c \in \mathbb{R}$ with $2c > D(A, B)$, then the set of points

$$\{M \in X \mid D(A, M) + D(B, M) = 2c\}$$

is the ellipse with foci $A$ and $B$ and with half axis $c$.

• Let $S \subset X$ be a subset. We define the diameter of $S$ to be

$$\operatorname{diam} S = \operatorname{lub}\{D(x, y) \mid x, y \in S\}.$$

A subset $S$ of $X$ is called bounded if $\operatorname{diam} S < \infty$.

• Let $S_1$ and $S_2$ be two subsets of the metric space $X$. Then the distance
between $S_1$ and $S_2$ is

$$D(S_1, S_2) = \operatorname{glb}\{D(x, y) \mid x \in S_1, y \in S_2\},$$

where glb is the greatest lower bound. The distance between a point $x \in X$
and a subset $A \subset X$ is $D(\{x\}, A)$. We observe that this definition of distance
between subsets does not establish a metric on $\mathcal{P}(X)$, the set of subsets of $X$.
Indeed, for any two subsets $S_1$ and $S_2$ in $X$ such that $S_1 \neq S_2$ and $S_1 \cap S_2 \neq \emptyset$,
the distance between them is $D(S_1, S_2) = 0$, and hence, even the first axiom
for metric spaces fails. However, in geometry, the notion of distance between
sets, especially disjoint sets, is quite useful.
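
For finite subsets of the plane, the least upper bound and greatest lower bound in the last two items are an ordinary maximum and minimum, so both quantities can be computed directly. The sketch below is an illustration under that finiteness assumption, with hypothetical function names; it also shows two distinct sets at distance 0, the failure of the first metric axiom noted above.

```python
import math
from itertools import product

def euclid(p, q):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.dist(p, q)

def diameter(S, metric=euclid):
    """diam S for a finite, nonempty set S: the largest distance between two of its points."""
    return max(metric(x, y) for x, y in product(S, repeat=2))

def set_distance(S1, S2, metric=euclid):
    """D(S1, S2) for finite, nonempty sets: the smallest distance from a point of S1 to a point of S2."""
    return min(metric(x, y) for x, y in product(S1, S2))

S1 = [(0, 0), (1, 0)]
S2 = [(1, 0), (4, 4)]
print(diameter(S1))          # 1.0
print(set_distance(S1, S2))  # 0.0, although S1 != S2
```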


A.1.2. Open and Closed Sets 


In the study of real functions, we often use the notions of open intervals and closed 
intervals. In this context, we simply say that a bounded interval is open if it does 
not include its endpoints and closed if it includes both of them; a similar definition 
is given for an unbounded interval. Then a subset of R is called open if it is a 
disjoint union of open intervals. In contrast, in $\mathbb{R}^n$ or in a metric space, given the
wide range of possibilities for the shape of sets, we cannot legitimately talk about 
endpoints, though we could attempt to make sense of the concept of “including its 
boundary points.” Regardless, a different definition for openness and closedness is 
required.


Definition A.1.8. Let $(X, D)$ be a metric space. A subset $U \subset X$ is called open
if for all $p \in U$ there exists $r > 0$ such that the open ball $B_r(p) \subset U$. A subset
$F \subset X$ is called closed if the complement $F^c \overset{\text{def}}{=} X - F$ is open.

Intuitively, this definition states that a subset U of a metric space is called open 
if around every point there is an open ball, perhaps with a small radius, that is 
completely contained in U. Note that we may wish to consider more than one 
metric at the same time on the same set X. In this case, we will refer to a D-open 
set. 


Proposition A.1.9. Let $(X, D)$ be a metric space. Then

1. $X$ and $\emptyset$ are both open;

2. the intersection of any two open sets is open;

3. the union of any collection of open sets is open.


Proof. For part 1, if $p \in X$, then any open ball satisfies $B_r(p) \subset X$. Also, since $\emptyset$
is empty, the criterion for openness holds trivially for $\emptyset$.

To prove part 2, let $U_1$ and $U_2$ be two open sets and let $p \in U_1 \cap U_2$. Since $U_1$
and $U_2$ are open, there exist $r_1$ and $r_2$ such that $B_{r_1}(p) \subset U_1$ and $B_{r_2}(p) \subset U_2$.
Take $r = \min(r_1, r_2)$. Then $B_r(p) \subset B_{r_1}(p) \subset U_1$ and $B_r(p) \subset B_{r_2}(p) \subset U_2$, so
$B_r(p) \subset U_1 \cap U_2$. Thus, $U_1 \cap U_2$ is open.

Finally, consider a collection of open sets $U_\alpha$ where $\alpha$ is an index taken from
some indexing set $I$, which is not necessarily finite. Define

$$U = \bigcup_{\alpha \in I} U_\alpha.$$

For any $p \in U$, there exists some $\alpha_0 \in I$ such that $p \in U_{\alpha_0}$. Since $U_{\alpha_0}$ is open, there
exists $r$ such that $B_r(p) \subset U_{\alpha_0}$, and thus, $B_r(p) \subset U$. Consequently, $U$ is open.


Using Proposition A.1.9(2), it is easy to show that any intersection of a finite 
number of open sets is again open. In contrast, part 3 states that the union of any 
collection of open subsets of X is again open, regardless of whether this collection 
is finite or not. This difference between unions and intersections of open sets is 
not an insufficiency of this proposition but rather a fundamental aspect of open 
sets in a metric space. In fact, as the following simple example shows, the infinite 
intersection of open sets need not be open. 


Example A.1.10. For each integer $n \geq 1$, consider the open intervals
$I_n = (0, 1 + \frac{1}{n})$, and define

$$S = \bigcap_{n=1}^{\infty} I_n.$$

Obviously, $I_{n+1} \subsetneq I_n$, and so the intervals form a decreasing, nested chain. Since
$\lim_{n \to \infty} \frac{1}{n} = 0$, we expect $S$ to contain $(0, 1)$, but we must determine whether it
contains anything more. If $r > 1$, then if $n$ is large enough so that $\frac{1}{n} < r - 1$, we
have $r \notin I_n$. On the other hand, for all $n \in \mathbb{Z}^{\geq 1}$, $\frac{1}{n} > 0$, so $1 < 1 + \frac{1}{n}$. Hence,
$1 \in I_n$ for all $n \in \mathbb{Z}^{\geq 1}$, and thus, $1 \in S$. Thus, we conclude that $S = (0, 1]$, which is
not an open set because no open ball around 1 is contained in $S$. This
shows that the infinite intersection of open sets need not be open.


Figure A.2: Example A.1.11. Figure A.3: Example A.1.12.
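
The two membership claims in Example A.1.10 can be checked with exact rational arithmetic. The sketch below is only an illustration over finitely many values of $n$ (the bound `max_n` is an assumption), so it cannot replace the argument above; the helper names are hypothetical.

```python
from fractions import Fraction

def in_I(x, n):
    """Membership test for the open interval I_n = (0, 1 + 1/n)."""
    return 0 < x < 1 + Fraction(1, n)

def first_excluding(x, max_n=10**6):
    """Smallest n <= max_n with x not in I_n, or None if x lies in every I_n tested."""
    for n in range(1, max_n + 1):
        if not in_I(x, n):
            return n
    return None

print(first_excluding(Fraction(1), max_n=1000))         # None: 1 lies in every I_n tested
print(first_excluding(Fraction(101, 100), max_n=1000))  # 100: 101/100 is not below 1 + 1/100
```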


Example A.1.11. As a more down-to-earth example, we wish to show that according
to this definition, the set $S = \{(x, y) \in \mathbb{R}^2 \mid 0 < x < 1 \text{ and } 0 < y < 1\}$ is
open in $\mathbb{R}^2$ equipped with the Euclidean metric. Let $p = (x_0, y_0)$ be a point in $S$.
Since $p \in S$, we see that $x_0 > 0$, $1 - x_0 > 0$, $y_0 > 0$, and $1 - y_0 > 0$. Since the
closest distance from a point $p$ to any line $L$ is along a perpendicular to $L$, the
closest distance between $p$ and any of the lines $x = 0$, $x = 1$, $y = 0$, and $y = 1$ is
$\min\{x_0, 1 - x_0, y_0, 1 - y_0\}$. Consequently, if $r$ is any positive real number such that

$$r < \min\{x_0, 1 - x_0, y_0, 1 - y_0\},$$

then $B_r(p) \subset S$. (See Figure A.2.)


Example A.1.12. In contrast to the previous example, consider the set

$$T = \{(x, y) \in \mathbb{R}^2 \mid 0 \leq x < 1 \text{ and } 0 < y < 1\},$$

where again we assume $\mathbb{R}^2$ is equipped with the Euclidean metric. The work in
Example A.1.11 shows that for any point $p = (x_0, y_0)$, with $0 < x_0 < 1$ and
$0 < y_0 < 1$, there exists a positive radius $r$ such that $B_r(p) \subset S \subset T$. Thus,
consider now points $p \in T$ with coordinates $(0, y_0)$. For all positive $r$, the open ball
$B_r(p)$ contains the point $(-r/2, y_0)$, which is not in $T$. Hence, no open ball centered
around points $(0, y_0)$ is contained in $T$, and hence $T$ is not open. (See Figure A.3.)


It is very common in proofs and definitions that rely on topology to refer to an 
open set that contains a particular point. Here is the common terminology.


Definition A.1.13. Let $p$ be a point in a metric space $(X, D)$. An open neighborhood
(or simply neighborhood) of $p$ is any open set of $X$ that contains $p$.


Closed sets satisfy properties quite similar to those described in Proposition 
A.1.9, with a slight but crucial difference. 


Proposition A.1.14. Let (X,D) be a metric space. Then 
1. $X$ and $\emptyset$ are both closed;


2. the union of any two closed sets is closed; 


3. the intersection of any collection of closed sets is closed. 


Because a set is defined as closed if its complement is open and because of the
DeMorgan laws for sets, this proposition is actually a simple corollary of Proposition
A.1.9. Therefore, we leave the details of the proof to the reader.

Note that in any metric space, the whole set $X$ and the empty set $\emptyset$ are both
open and closed. Depending on the particular metric space, these are not necessarily
the only subsets of $X$ that are both open and closed.


Proposition A.1.15. Let $(X, D)$ be a metric space, and let $x \in X$. The singleton
set $\{x\}$ is a closed subset of $X$.


Proof. To prove that $\{x\}$ is closed, we must prove that $X - \{x\}$ is open. Let $y$ be
a point in $X - \{x\}$. Since $x \neq y$, by the axioms of a metric space, $D(x, y) > 0$. Let
$r = \frac{1}{2} D(x, y)$. The real number $r$ is positive, and we consider the open ball $B_r(y)$.
Since $D(x, y) > r$, then $x \notin B_r(y)$, and hence, $B_r(y) \subset X - \{x\}$. Hence, we have
shown that $X - \{x\}$ is open and thus that $\{x\}$ is closed.


The notion of distance between sets provides an alternate characterization of
closed sets in metric spaces. Recall that for any subset $A \subset X$, $x \in A$ implies that
$D(x, A) = 0$. The following proposition shows that the converse holds precisely for
closed sets.


Proposition A.1.16. Let $(X, D)$ be a metric space. A subset $F$ is closed if and
only if $D(x, F) = 0$ implies $x \in F$.


Proof. Suppose first that $F$ is closed. If $x \notin F$, then $x \in X - F$, which is open,
so there exists an open ball $B_r(x)$ around $x$ contained entirely in $X - F$. Hence,
the distance between any point $a \in F$ and $x$ is at least the radius $r > 0$; thus,
$D(x, F) \geq r > 0$, and in particular, $D(x, F) \neq 0$. Thus, $D(x, F) = 0$ implies that $x \in F$.

We now prove the converse. Suppose that $F$ is a subset of $X$ such that $D(x, F) = 0$
implies that $x \in F$. Then for all $x \in X - F$, we have $D(x, F) > 0$. Take the
positive number $r = \frac{1}{2} D(x, F)$, and consider the open ball $B_r(x)$. Let $p$ be any
point in $F$ and $a$ any point in $B_r(x)$. From the triangle inequality

$$D(p, x) \leq D(p, a) + D(a, x) \implies D(p, a) \geq D(p, x) - D(a, x).$$

The least possible value for $D(p, a)$ occurs when $D(p, x)$ is the least possible and
when $D(a, x)$ is the greatest possible, that is when $D(p, x) = D(x, F) = 2r$ and $D(a, x) = r$.
Thus, we find that $D(p, a) > r > 0$. Hence, for all $a \in B_r(x)$, we have $D(F, a) > 0$
and thus $B_r(x) \cap F = \emptyset$. Therefore, $B_r(x) \subset X - F$, so $X - F$ is open and $F$ is
closed.


Proposition A.1.16 indicates that given any subset A of a metric space X, one 
can obtain a closed subset of X by adjoining all the points with 0 distance from A. 
This motivates the following definition. 


Definition A.1.17. Let $(X, D)$ be a metric space, and let $A \subset X$ be any subset.
Define the closure of $A$ as

$$\operatorname{Cl} A = \{x \in X \mid D(x, A) = 0\}.$$


Proposition A.1.18. Let $(X, D)$ be a metric space and $A$ any subset of $X$. $\operatorname{Cl} A$
is the smallest closed set containing $A$. In other words,

$$\operatorname{Cl} A = \bigcap_{A \subset F,\ F \text{ closed}} F.$$


Proof. (Left as an exercise for the reader. See Problem A.1.20.) 


A.1.3. Sequences 


In standard calculus courses, one is introduced to the notion of a sequence of real 
numbers along with issues of convergence and limits. The definition given in such 
courses for when we say a sequence converges to a certain limit formalizes the idea 
of all terms in the sequence ultimately coming arbitrarily close to the limit point. 
Consequently, since limits formalize a concept about closeness and distance, the 
natural and most general context for convergence and limits is in a metric space. 


Definition A.1.19. Let $(X, D)$ be a metric space and let $\{x_n\}_{n \in \mathbb{N}}$ be a sequence
in $X$. The sequence $\{x_n\}$ is said to converge to the limit $\ell \in X$ if for all $\varepsilon \in \mathbb{R}^{>0}$
there exists $N \in \mathbb{N}$ such that if $n > N$, then $D(x_n, \ell) < \varepsilon$ (i.e., $x_n \in B_\varepsilon(\ell)$). If $\{x_n\}$
converges to $\ell$, then we write

$$\lim_{n \to \infty} x_n = \ell.$$

Note that we can restate Definition A.1.19 to say that $\{x_n\}$ converges to $\ell$ if for
all positive $\varepsilon \in \mathbb{R}^{>0}$, only finitely many elements of the sequence $\{x_n\}$ are not in
the open ball $B_\varepsilon(\ell)$.


Example A.1.20. Consider the sequence $\{x_n\}_{n \geq 1}$ in $\mathbb{R}^3$ given by
$x_n = \left(3, \frac{1}{n+2}, \frac{2n}{n+1}\right)$.
We prove that $\{x_n\}$ converges to $(3, 0, 2)$. We know that as sequences of real numbers,

$$\lim_{n \to \infty} \frac{1}{n+2} = 0 \quad \text{and} \quad \lim_{n \to \infty} \frac{2n}{n+1} = 2.$$

Pick any positive $\varepsilon$. Choose $N_1$ such that $n > N_1$ implies that $\frac{1}{n+2} < \frac{\varepsilon}{\sqrt{2}}$, and
choose $N_2$ such that $n > N_2$ implies that $\left|\frac{2n}{n+1} - 2\right| < \frac{\varepsilon}{\sqrt{2}}$. Using the Euclidean
distance

$$D(x_n, (3, 0, 2)) = \sqrt{(3 - 3)^2 + \left(\frac{1}{n+2} - 0\right)^2 + \left(\frac{2n}{n+1} - 2\right)^2},$$

one sees that if $n > N = \max(N_1, N_2)$, then

$$D(x_n, (3, 0, 2)) < \sqrt{\frac{\varepsilon^2}{2} + \frac{\varepsilon^2}{2}} = \varepsilon.$$

This proves that $\lim x_n = (3, 0, 2)$.

Note that we could have proved directly that $\lim x_n = (3, 0, 2)$ by considering
the limit of $D(x_n, (3, 0, 2))$ as a sequence of real numbers and proving that this
converges to 0.
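
The closing remark of Example A.1.20 suggests a direct numerical check: compute $D(x_n, (3, 0, 2))$ and look for the first index at which it drops below a given $\varepsilon$. The sketch below assumes the sequence is $x_n = (3, \frac{1}{n+2}, \frac{2n}{n+1})$; the function names and the search bound `max_n` are hypothetical. Because the distance is decreasing in $n$ for this sequence, every later term is also within $\varepsilon$.

```python
import math

LIMIT = (3.0, 0.0, 2.0)

def x(n):
    """n-th term of the assumed sequence (3, 1/(n+2), 2n/(n+1))."""
    return (3.0, 1.0 / (n + 2), 2.0 * n / (n + 1))

def first_index_within(eps, max_n=10**6):
    """Smallest n <= max_n with D(x_n, LIMIT) < eps, or None if there is none."""
    for n in range(1, max_n + 1):
        if math.dist(x(n), LIMIT) < eps:
            return n
    return None

for eps in (0.5, 0.1, 0.01):
    print(eps, first_index_within(eps))
```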


Example A.1.21. Consider the set $X$ of bounded, real-valued functions defined
over the interval $[0, 1]$ equipped with the metric defined in Example A.1.7. For
$n \geq 1$, consider the sequence of functions given by

$$f_n(x) = \begin{cases} 1 - nx, & \text{for } 0 \leq x \leq \frac{1}{n}, \\ 0, & \text{for } \frac{1}{n} < x \leq 1. \end{cases}$$

Figure A.4 shows the functions for $n = 1, 2, 3$. One might suspect that the limit of
this sequence $f_n(x)$ would be the function

$$f(x) = \begin{cases} 1, & \text{if } x = 0, \\ 0, & \text{for } x > 0, \end{cases}$$

but this is not the case. Let $r = \frac{1}{2}$, and consider the $r$-collar around $f(x)$. There
is no $n$ such that $f_n(x)$ lies within the $r$-collar around $f(x)$. (See Figure A.5.)
Consequently, $f_n(x)$ does not converge to $f(x)$ in the metric space $(X, D)$. Note,
however, that for all $x \in [0, 1]$, as sequences of real numbers $\lim_{n \to \infty} f_n(x) = f(x)$.
We say that $f_n(x)$ converges pointwise.
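
The contrast in Example A.1.21 between pointwise convergence and convergence in the metric $D$ shows up clearly in a small computation. The sketch below approximates $D(f_n, f)$ by a maximum over a uniform grid, an assumption that only gives a lower estimate of the least upper bound; the function names are hypothetical. At a fixed $x > 0$ the values $f_n(x)$ reach 0, while the estimated distance stays close to 1.

```python
def f_n(n, x):
    """The function f_n: 1 - n*x on [0, 1/n] and 0 on (1/n, 1]."""
    return 1.0 - n * x if x <= 1.0 / n else 0.0

def f(x):
    """The pointwise limit: 1 at x = 0 and 0 for x > 0."""
    return 1.0 if x == 0 else 0.0

def sup_distance_on_grid(n, samples=100_001):
    """Grid estimate of D(f_n, f) = lub |f_n(x) - f(x)| over [0, 1]."""
    xs = (i / (samples - 1) for i in range(samples))
    return max(abs(f_n(n, x) - f(x)) for x in xs)

for n in (1, 10, 100):
    # Second column: value at the fixed point x = 0.05; third column: estimated distance.
    print(n, f_n(n, 0.05), sup_distance_on_grid(n))
```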


Proposition A.1.22. Let $(X, D)$ be a metric space. Any sequence $\{x_n\}$ can converge
to at most one limit point.


Proof. Suppose that

$$\lim_{n \to \infty} x_n = \ell \quad \text{and} \quad \lim_{n \to \infty} x_n = \ell'.$$

Let $\varepsilon$ be any positive real number. There exists $N_1$ such that $n > N_1$ implies that
$D(x_n, \ell) < \frac{\varepsilon}{2}$, and there exists $N_2$ such that $n > N_2$ implies that $D(x_n, \ell') < \frac{\varepsilon}{2}$.
Thus, taking some $n > \max(N_1, N_2)$, we deduce from the triangle inequality that

$$D(\ell, \ell') \leq D(x_n, \ell) + D(x_n, \ell') < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

Thus, since $D(\ell, \ell')$ is less than any positive real number, we deduce that $D(\ell, \ell') = 0$
and hence that $\ell = \ell'$.


Figure A.4: Sequence of functions. Figure A.5: $r$-collar around $f(x)$.


In any metric space, there are plenty of sequences that do not converge to any 
limit. For example, the sequence $\{a_n\}_{n \geq 1}$ of real numbers given by $a_n = (-1)^n + \frac{1}{n}$
does not converge toward anything but, in the long term, alternates between being 
very close to 1 and very close to —1. Referring to the restatement of Definition 
A.1.19, one can loosen the definition of limit to incorporate the behavior of such 
sequences as the one just mentioned. 


Definition A.1.23. Let $(X, D)$ be a metric space, and let $\{x_n\}$ be a sequence in
$X$. A point $p \in X$ is called an accumulation point of $\{x_n\}$ if for all real $\varepsilon > 0$, an
infinite number of elements $x_n$ are in $B_\varepsilon(p)$. The accumulation set of $\{x_n\}$ is the
set of all accumulation points.
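
Definition A.1.23 can be explored numerically for the sequence $a_n = (-1)^n + \frac{1}{n}$ mentioned above. The sketch below counts how many of the first $N$ terms fall in a ball of radius $\varepsilon$ around a candidate point; a count that keeps growing with $N$ is consistent with an accumulation point, while a count that stalls is not. A finite computation can only suggest the answer, and the function names are hypothetical.

```python
def a(n):
    """n-th term of the sequence a_n = (-1)^n + 1/n."""
    return (-1) ** n + 1.0 / n

def count_in_ball(center, eps, n_max):
    """Number of indices 1 <= n <= n_max with |a_n - center| < eps."""
    return sum(1 for n in range(1, n_max + 1) if abs(a(n) - center) < eps)

for center in (1.0, -1.0, 0.0):
    print(center, [count_in_ball(center, 0.1, N) for N in (100, 1000, 10000)])
```

With $\varepsilon = 0.1$, the counts around $1$ and $-1$ grow with $N$, whereas the ball around $0$ only ever contains the single term $a_1 = 0$.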


Example A.1.24. Consider again the real sequence $a_n = (-1)^n + \frac{1}{n}$. Let $\varepsilon$ be any
positive real number. If $n > \frac{1}{\varepsilon}$ and $n$ is even, then $a_n \in B_\varepsilon(1)$. If $n > \frac{1}{\varepsilon}$ and $n$ is
odd, then $a_n \in B_\varepsilon(-1)$. Hence, 1 and $-1$ are accumulation points. However, for
any $r$ different than 1 or $-1$, suppose we choose an $\varepsilon$ such that $\varepsilon < \min(|r - 1|, |r + 1|)$.
If $n$ is large enough, then

$$\frac{1}{n} < \left| \min(|r - 1|, |r + 1|) - \varepsilon \right|,$$

and for such $n$, we have $a_n \notin B_\varepsilon(r)$. Thus, 1 and $-1$ are the only accumulation
points of $\{a_n\}$. In the terminology of Definition A.1.23, the accumulation set is
$\{-1, 1\}$.


A.1.4 Continuity


For the same reason as for the convergence of sequences, the notion of continuity, 
first introduced in the context of real functions over an interval, generalizes naturally 
to the category of metric spaces. Here is the definition. 


Definition A.1.25. Let $(X, D)$ and $(Y, D')$ be two metric spaces. A function
$f : X \to Y$ is called continuous at $a \in X$ if for all $\varepsilon \in \mathbb{R}^{>0}$, there exists $\delta \in \mathbb{R}^{>0}$
such that $D(x, a) < \delta$ implies that $D'(f(x), f(a)) < \varepsilon$. The function $f$ is called
continuous if it is continuous at all points $a \in X$.


Example A.1.26. As a first example of Definition A.1.25, consider the function
$f : \mathbb{R}^2 \to \mathbb{R}$ given by $f(x, y) = x + y$, where we assume $\mathbb{R}^2$ and $\mathbb{R}$ are equipped with
the usual Euclidean metrics. Consider some point $(a_1, a_2) \in \mathbb{R}^2$. Let $\varepsilon > 0$ be any
positive real number. Choosing $\delta = \frac{\varepsilon}{2}$ will suffice, as we now show. First note that

$$D((x, y), (a_1, a_2)) = \sqrt{(x - a_1)^2 + (y - a_2)^2} < \frac{\varepsilon}{2}$$

implies that

$$|x - a_1| < \frac{\varepsilon}{2} \quad \text{and} \quad |y - a_2| < \frac{\varepsilon}{2}.$$

But if this is so, then

$$|f(x, y) - f(a_1, a_2)| = |x + y - (a_1 + a_2)| \leq |x - a_1| + |y - a_2| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

Thus, $f$ is continuous.
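
The choice $\delta = \frac{\varepsilon}{2}$ in Example A.1.26 can be sanity-checked by sampling. The sketch below draws random points in the open disk of radius $\delta$ around $(a_1, a_2)$ and records the largest value of $|f(x, y) - f(a_1, a_2)|$ seen; the sampling scheme, seed, and function names are assumptions made for illustration, and passing the check is not a proof.

```python
import math
import random

def f(x, y):
    """The function f(x, y) = x + y of the example."""
    return x + y

def worst_deviation(a1, a2, eps, trials=10_000, seed=0):
    """Largest |f(x, y) - f(a1, a2)| over sampled points within delta = eps/2 of (a1, a2)."""
    delta = eps / 2
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        radius = delta * rng.random()             # radius in [0, delta)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x = a1 + radius * math.cos(angle)
        y = a2 + radius * math.sin(angle)
        worst = max(worst, abs(f(x, y) - f(a1, a2)))
    return worst

eps = 0.01
print(worst_deviation(1.0, -2.0, eps) < eps)  # True
```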


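Example A.1.27. Definition A.1.25 allows one to study the continuity of functions
in much more general contexts, as we show with this example. Let $X$ be a proper
subset of $\mathbb{R}^n$, and let $p$ be a point in $\mathbb{R}^n - X$. We view $X$ as a metric space by
restricting the Euclidean metric to it. Let $S^{n-1}$ be the unit sphere in $\mathbb{R}^n$ also with
its metric coming from the Euclidean one in $\mathbb{R}^n$. Define a function $f : X \to S^{n-1}$
by

$$f(\vec{x}) = \frac{\vec{x} - p}{\|\vec{x} - p\|}.$$

We will show that $f$ is continuous.
Let $\vec{a} \in X$. The Euclidean metric is $D(\vec{x}, \vec{a}) = \|\vec{x} - \vec{a}\|$. Hence

$$D(f(\vec{x}), f(\vec{a})) = \left\| \frac{\vec{x} - p}{\|\vec{x} - p\|} - \frac{\vec{a} - p}{\|\vec{a} - p\|} \right\| = \sqrt{2 - 2\, \frac{(\vec{x} - p) \cdot (\vec{a} - p)}{\|\vec{x} - p\| \, \|\vec{a} - p\|}} = \sqrt{2 - 2 \cos \alpha},$$

where $\alpha$ is the angle between the vectors $(\vec{a} - p)$ and $(\vec{x} - p)$. However, from the
trigonometric identity $\sin^2 \theta = (1 - \cos 2\theta)/2$, we deduce that

$$D(f(\vec{x}), f(\vec{a})) = 2 \sin\left(\frac{\alpha}{2}\right).$$

If $\vec{x}$ and $\vec{a}$ are close enough, then $(\vec{a} - p)$ and $(\vec{x} - p)$ form an acute angle, and hence,
if $d$ is the height from $\vec{a}$ to the segment between $p$ and $\vec{x}$, we have

$$D(f(\vec{x}), f(\vec{a})) = 2 \sin\left(\frac{\alpha}{2}\right) \leq 2 \sin \alpha = \frac{2d}{\|\vec{a} - p\|} \leq \frac{2\|\vec{x} - \vec{a}\|}{\|\vec{a} - p\|}.$$

Therefore, choosing $\delta$ small enough so that the angle between $(\vec{a} - p)$ and $(\vec{x} - p)$
is acute and $\delta < \frac{1}{2} \|p - \vec{a}\| \, \varepsilon$, we conclude that

$$\|\vec{x} - \vec{a}\| < \delta \implies D(f(\vec{x}), f(\vec{a})) < \varepsilon,$$

proving that $f$ is continuous at all points $\vec{a} \in X$.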


Proposition A.1.28. Let $(X, D)$, $(Y, D')$, and $(Z, D'')$ be metric spaces. Let
$f : X \to Y$ and $g : Y \to Z$ be functions such that $f$ is continuous at a point $a \in X$
and $g$ is continuous at $f(a) \in Y$. Then the composite function $g \circ f : X \to Z$ is
continuous at $a$.


Proof. Since $f$ is continuous at $a$, for all $\varepsilon_1 \in \mathbb{R}^{>0}$, there exists $\delta_1 \in \mathbb{R}^{>0}$ such
that $D(x, a) < \delta_1$ implies that $D'(f(x), f(a)) < \varepsilon_1$. Since $g$ is continuous at $f(a)$,
for all $\varepsilon_2 \in \mathbb{R}^{>0}$, there exists $\delta_2 \in \mathbb{R}^{>0}$ such that $D'(y, f(a)) < \delta_2$ implies that
$D''(g(y), g(f(a))) < \varepsilon_2$. Therefore, given any $\varepsilon > 0$, set $\varepsilon_2 = \varepsilon$ and choose $\varepsilon_1$ so that
$\varepsilon_1 \leq \delta_2$. Then

$$D(x, a) < \delta_1 \implies D'(f(x), f(a)) < \varepsilon_1 \leq \delta_2 \implies D''(g(f(x)), g(f(a))) < \varepsilon,$$

showing that $g \circ f$ is continuous at $a$.


Using the concepts of open sets, we can give alternate formulations for when a 
function between metric spaces is continuous. 


Proposition A.1.29. Let $(X, D)$ and $(Y, D')$ be two metric spaces, and let
$f : X \to Y$ be a function. The function $f$ is continuous if and only if for all open
subsets $U \subset Y$, the set

$$f^{-1}(U) = \{x \in X \mid f(x) \in U\}$$

is an open subset of $X$.


Proof. First suppose that $f$ is continuous. Let $U$ be an open subset of $Y$, and let $x$
be some point in $f^{-1}(U)$. Of course $f(x) \in U$. Since $U$ is open, there exists a real
$\varepsilon > 0$ such that $B_\varepsilon(f(x)) \subset U$. Since $f$ is continuous, there exists a $\delta > 0$ such that
$y \in B_\delta(x)$ implies that $f(y) \in B_\varepsilon(f(x))$. Hence,

$$f(B_\delta(x)) \subset B_\varepsilon(f(x)) \subset U,$$

and thus, $B_\delta(x) \subset f^{-1}(U)$. Since $x$ was arbitrary, $f^{-1}(U)$ is open.
Conversely, suppose that $f^{-1}(U)$ is an open set in $X$ for every open set $U$ in $Y$.
Let $x$ be a point in $X$, and let $\varepsilon$ be a positive real number. Since the open ball
$B_\varepsilon(f(x))$ is an open set in $Y$,

$$f^{-1}(B_\varepsilon(f(x)))$$

is an open set in $X$. Since $x$ belongs to $f^{-1}(B_\varepsilon(f(x)))$, which is open, there exists some $\delta$ such
that $B_\delta(x) \subset f^{-1}(B_\varepsilon(f(x)))$. Thus, $f(B_\delta(x)) \subset B_\varepsilon(f(x))$, and therefore, $f$ is
continuous.


The following proposition is an equivalent formulation to Proposition A.1.29 but 
often more convenient for proofs. 


Proposition A.1.30. Let $(X, D)$ and $(Y, D')$ be two metric spaces and let $f : X \to Y$
be a function. The function $f$ is continuous if and only if for all open balls $B_r(p)$
in $Y$, the set $f^{-1}(B_r(p))$ is an open subset of $X$.


Proof. (Left as an exercise for the reader. See Problem A.1.27.) 


Much more could be included in an introduction to metric spaces. However, 
many properties of metric spaces and continuous functions between them hold sim- 
ply because of the properties of open sets (Proposition A.1.9) and the characteriza- 
tion of continuous functions in terms of open sets (Proposition A.1.29). This fact 
motivates the definition of topological spaces. 


PROBLEMS 
A.1.1. Prove that $D_1$, $D_3$, and $D_\infty$ from Example A.1.3 are in fact metrics on $\mathbb{R}^2$.


A.1.2. In the following functions on $\mathbb{R}^2 \times \mathbb{R}^2$, which axioms fail to make the function into
a metric?

(a) $D_1((x_1, y_1), (x_2, y_2)) = |x_1| + |x_2| + |y_1| + |y_2|$.

(b) $D_2((x_1, y_1), (x_2, y_2)) = -((x_2 - x_1)^2 + (y_2 - y_1)^2)$.

(c) $D_3((x_1, y_1), (x_2, y_2)) = |x_2 - x_1| \cdot |y_2 - y_1|$.

(d) $D_4((x_1, y_1), (x_2, y_2)) = |x_2^2 - x_1^2| + |y_2 - y_1|$.


A.1.3. Let $(X_1, D_1)$ and $(X_2, D_2)$ be metric spaces. Consider the Cartesian product
$X = X_1 \times X_2$. Prove that the following function is a metric on $X$:

$$D((p_1, p_2), (q_1, q_2)) = D_1(p_1, q_1) + D_2(p_2, q_2).$$


A.1.4. Let $X = \mathcal{P}_{\text{fin}}(\mathbb{Z})$ be the set of all finite subsets of the integers. Recall that the
symmetric difference between two sets $A$ and $B$ is $A \triangle B = (A - B) \cup (B - A)$.
Define the function $D : X \times X \to \mathbb{R}^{\geq 0}$ by

$$D(A, B) = |A \triangle B|,$$

the cardinality of $A \triangle B$. Prove that $D$ is a metric on $X$.


A.1.5. In Euclidean geometry, the median line between two points $p_1$ and $p_2$ in $\mathbb{R}^2$ is
defined as the set of points that are of equal distance from $p_1$ and $p_2$, i.e.,

$$M = \{q \in \mathbb{R}^2 \mid D(q, p_1) = D(q, p_2)\}.$$

What is the shape of the median lines in $\mathbb{R}^2$ for $D_1$, $D_3$, and $D_\infty$ from Example
A.1.3?


A.1.6. Prove that if $(X, D)$ is any metric space, then $D(x, y)^{1/n}$, where $n$ is any positive
integer, is also a metric on $X$.


A.1.7. Let $(X, D)$ be a metric space, and let $S$ be any subset of $X$. Prove that $(S, D)$ is
also a metric space. (The metric space $(S, D)$ is referred to as the restriction of
$D$ to $S$.)


A.1.8. Let $S^2$ be the unit sphere in $\mathbb{R}^3$, i.e.,

$$S^2 = \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1\}.$$

Sketch the open balls on $S^2$ obtained by the restriction of the Euclidean metric to
$S^2$ (see Problem A.1.7). Setting the radius $r < 2$, for some point $p \in S^2$, describe
$B_r(p)$ algebraically by the equation $x^2 + y^2 + z^2 = 1$ and some linear inequality
in $x$, $y$, and $z$.


A.1.9. Prove that Example A.1.7 is in fact a metric space.


A.1.10. Consider the metric space $(\mathbb{R}^2, D_1)$, where $D_1$ is defined in Example A.1.3.

(a) Let $A$ and $B$ be two points in $\mathbb{R}^2$. Prove that the set of points between $A$
and $B$ is the rectangle with vertical or horizontal edges with $A$ and $B$ as
opposite corners.

(b) Determine the bisector of the points $A$ and $B$ in this metric.


A.1.11. Find the distance between the following pairs of sets in $\mathbb{R}^2$:

(a) $A = \{(x, y) \mid x^2 + y^2 < 1\}$ and $B = \{(x, y) \mid (x - 3)^2 + (y - 2)^2 < 1\}$.

(b) $A = \{(x, y) \mid xy = 1\}$ and $B = \{(x, y) \mid xy = 0\}$.

(c) $A = \{(x, y) \mid xy = 2\}$ and $B = \{(x, y) \mid x^2 + y^2 < 1\}$.


A.1.12. Prove that a subset $A$ of a metric space $(X, D)$ is bounded if and only if $A \subset B_r(p)$
for some $r \in \mathbb{R}^{>0}$ and $p \in X$.


A.1.13. Infinite Intersections and Unions. Let $A_n = [n, +\infty)$ and let $B_n = [\frac{1}{n}, \sin n]$.
Find

(a) $\bigcup_{n=0}^{\infty} A_n$, (b) $\bigcap_{n=0}^{\infty} A_n$, (c) $\bigcup_{n=2}^{\infty} B_n$, (d) $\bigcap_{n=2}^{\infty} B_n$.


A.1.14. Define the metric $D$ on $\mathbb{R}^2$ as follows. For any $p = (p_x, p_y)$ and $q = (q_x, q_y)$, let
$D(p, q) = \sqrt{4(p_x - q_x)^2 + (p_y - q_y)^2}$. Prove that this is in fact a metric. What is
the shape of a unit ball $B_1((x, y))$? What is the shape of the "median" between
two points? [Hint: see Problem A.1.5.]


A.1.15. Are the following subsets of the plane (using the usual Euclidean metric) open,
closed, or neither:

(a) $\{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < 1\}$.

(b) $\{(x, y) \in \mathbb{R}^2 : x^2 + y^2 > 1\}$.

(c) $\{(x, y) \in \mathbb{R}^2 : x + y = 0\}$.

(d) $\{(x, y) \in \mathbb{R}^2 : x + y \neq 0\}$.

(e) $\{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < 1 \text{ or } x = 0\}$.

(f) $\{(x, y) \in \mathbb{R}^2 : x^2 + y^2 \leq 1 \text{ or } x = 0\}$.

(g) The complement $A^c$ where $A = \{(x, y) \in \mathbb{R}^2 \mid x = 0 \text{ and } -1 < y < 1\}$.


A.1.16. Let $L$ be a line in the plane $\mathbb{R}^2$. Prove that $\mathbb{R}^2 - L$ is open in the Euclidean metric
and in the three metrics presented in Example A.1.3.


A.1.17. Let $X = \mathbb{R}^2$, and define $D_2$ as the Euclidean metric and $D_1$ as
$D_1((x_1, y_1), (x_2, y_2)) = |x_2 - x_1| + |y_2 - y_1|$. Prove that any $D_2$-open ball contains a
$D_1$-open ball and is also contained in a $D_1$-open ball. Conclude that a subset of $\mathbb{R}^2$
is $D_2$-open if and only if it is $D_1$-open.


A.1.18. Prove that the set $\{\frac{1}{n} \mid n \in \mathbb{Z}^{>0}\}$ is not closed in $\mathbb{R}$ whereas the set
$\{\frac{1}{n} \mid n \in \mathbb{Z}^{>0}\} \cup \{0\}$ is.


A.1.19. Consider a metric space $(X, D)$, and let $x$ and $y$ be two distinct points of $X$.
Prove that there exists a neighborhood $U$ of $x$ and a neighborhood $V$ of $y$ such
that $U \cap V = \emptyset$. (In general topology, this property is called the Hausdorff
property, and this exercise shows that all metric spaces are Hausdorff.)


A.1.20. Prove Proposition A.1.18.


A.1.21. Let $A$ be a subset of a metric space $(X, D)$. Suppose that every sequence $\{x_n\}$ in
$A$ that converges in $X$ converges to an element of $A$. Prove that $A$ is closed.


A.1.22. Let $\{x_n\}_{n \in \mathbb{N}}$ be a sequence in a metric space $(X, D)$. Prove that the closure of
the set of elements $\{x_n\}$ is $\{x_n \mid n \in \mathbb{N}\}$ together with the accumulation set of the
sequence $\{x_n\}$.


A.1.23. Using Definition A.1.25, prove that the real function $f(x) = 2x - 5$ is continuous
over $\mathbb{R}$.


A.1.24. Using Definition A.1.25, prove that

(a) the real function $f(x) = x^2$ is continuous over $\mathbb{R}$;

(b) the real function of two variables $f(x, y) = 1/(x^2 + y^2 + 1)$ is continuous over
all $\mathbb{R}^2$ (using the usual Euclidean metric).


A.1.25. Let $(X, D)$ and $(Y, D')$ be metric spaces, and let $f : X \to Y$ be a continuous
function. Prove that if a sequence $\{x_n\}$ in $X$ converges to a limit point $\ell$, then
the sequence $\{f(x_n)\}$ in $Y$ converges to $f(\ell)$.


A.1.26. Let $f : X \to Y$ be a function from the metric space $(X, D)$ to the metric space
$(Y, D')$. Suppose that $f$ is such that there exists a positive real number $\lambda$, with

$$\frac{D'(f(x_1), f(x_2))}{D(x_1, x_2)} \leq \lambda \quad \text{for all } x_1 \neq x_2,$$

i.e., that the stretching ratio for $f$ is bounded. Show that $f$ is continuous.


A.1.27. Prove Proposition A.1.30.


A.1.28. Let $X = \mathbb{R}^m$ and $Y = \mathbb{R}^n$ equipped with the usual Euclidean metric. Prove that
any linear transformation from $X$ to $Y$ is continuous.


A.2 Topological Spaces 
A.2.1 Definitions and Examples 


Definition A.2.1. A topological space is a pair $(X, \tau)$ where $X$ is a set and where
$\tau$ is a set of subsets of $X$ satisfying the following:

1. $X$ and $\emptyset$ are in $\tau$.

2. For all $U$ and $V$ in $\tau$, $U \cap V \in \tau$.

3. For any collection $\{U_\alpha\}_{\alpha \in I}$ of sets in $\tau$, the union $\bigcup_{\alpha \in I} U_\alpha$ is in $\tau$.


The elements in $\tau$ are called open subsets of $X$ and a subset $F \subset X$ is called closed
if $X - F$ is open, i.e., if $X - F \in \tau$.
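
On a finite set, the three axioms of Definition A.2.1 can be checked mechanically. The sketch below is an illustration under that finiteness assumption: for a finite family, closure under pairwise unions already gives closure under arbitrary unions, which is why only pairs are tested. The function name `is_topology` and the sample families are hypothetical.

```python
from itertools import combinations

def is_topology(X, tau):
    """Check the axioms of a topology for a family tau of subsets of a finite set X."""
    X = frozenset(X)
    tau = {frozenset(U) for U in tau}
    if any(not U <= X for U in tau):
        return False                      # every member must be a subset of X
    if X not in tau or frozenset() not in tau:
        return False                      # axiom 1
    for U, V in combinations(tau, 2):
        if U & V not in tau:              # axiom 2
            return False
        if U | V not in tau:              # axiom 3, reduced to pairs (finite case)
            return False
    return True

X = {"a", "b", "c"}
print(is_topology(X, [set(), {"a"}, {"a", "b"}, X]))  # True
print(is_topology(X, [set(), {"a"}, {"b"}, X]))       # False: the union {a, b} is missing
```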


As an alternate terminology we talk about $\tau$ satisfying the above three properties
as a topology on X. In the introduction to this appendix, we promised that topology 
attempts to provide a mental model that generalizes the notion of nearness. The 
following concept is key to this way of thinking. 


Definition A.2.2. Let $(X, \tau)$ be a topological space and let $x \in X$. Any $U \in \tau$
such that $x \in U$ is called a neighborhood of $x$.


If we work with more than one topology on the same underlying set $X$, we refer
to $\tau$-open and $\tau$-closed subsets to avoid ambiguity.

As with the properties of open sets in metric spaces, in criterion (3) of Definition
A.2.1, the indexing set $I$ need not be countable, and hence, we should not assume
that the collection $\{U_\alpha\}_{\alpha \in I}$ can be presented as a sequence of subsets.


Example A.2.3. According to Proposition A.1.9, if $(X, D)$ is a metric space, it is
also a topological space, where we take the topology $\tau$ to be the collection of open
sets as defined by Definition A.1.8. Historically, it was precisely Proposition A.1.9 along with the
discovery by mathematicians that collections of sets with the properties described
in this proposition arise naturally in numerous other contexts that led to the given
definition of a topological space.


Example A.2.4 (Euclidean topology). Consider the metric space $(\mathbb{R}^n, D)$, where
$D$ is the Euclidean metric. The topology induced on $\mathbb{R}^n$ according to Example A.2.3
is called the Euclidean topology.


Example A.2.5 (Discrete topology). Let $X$ be any set. Setting $\tau = \mathcal{P}(X)$ to be
the set of all subsets of $X$ gives a topology on $X$ called the discrete topology on $X$. In
the discrete topology, all subsets of $X$ are both closed and open.


Example A.2.6 (Trivial topology). On the opposite end of the spectrum from the
discrete topology, setting $\tau = \{X, \emptyset\}$ also satisfies the axioms of a topology, and this is
called the trivial topology on $X$. These two examples represent the largest and the smallest
possible examples of topologies on a set $X$.


Example A.2.7. Let $X = \{a, b, c\}$ be a set with three elements. Consider the set
of subsets $\tau = \{\emptyset, \{a\}, \{a, b\}, X\}$. A simple check shows that $\tau$ is a topology on $X$,
namely that $\tau$ satisfies all the axioms for a topology. Notice that $\{a\}$ is open, $\{c\}$ is
closed (since $\{a, b\}$ is open) and that $\{b\}$ is neither open nor closed. By Proposition
A.1.15, every singleton is closed in a metric space, so there is no metric $D$ on $X$ such
that the $D$-open sets of $X$ are the open sets in the topology $\tau$. We say that $(X, \tau)$ is
not metrizable.


It is not always easy to specify a subset of $\mathcal{P}(X)$ that satisfies the axioms for a
topology on $X$. The concept of a basis makes this possible and Proposition A.2.9
gives a practical characterization of a basis.


Definition A.2.8. Let $(X, \tau)$ be a topological space. A collection of open sets
$\mathcal{B} \subset \tau$ is called a basis of the topology if every open set is a union of elements in
$\mathcal{B}$.


Proposition A.2.9. Let $(X,\tau)$ be a topological space, and suppose that $\mathcal{B}$ is a basis. 
Then: 


1. the elements of $\mathcal{B}$ cover $X$; 
2. if $B_1, B_2 \in \mathcal{B}$, then for all $x \in B_1 \cap B_2$ there exists $B_3 \in \mathcal{B}$ such that 
$x \in B_3 \subset B_1 \cap B_2$. 


Conversely, if any collection $\mathcal{B}$ of subsets of $X$ satisfies the above two properties, then 
there exists a unique topology on $X$ for which $\mathcal{B}$ is a basis. (This topology is said to 
be generated by $\mathcal{B}$.) 


Proof. (Left as an exercise for the reader.) 


This characterization allows one to easily describe topologies by presenting a 
basis of open sets. 
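
For a finite set, the two conditions of Proposition A.2.9 and the topology generated by a basis can be computed directly. The sketch below is illustrative only: the function names `is_basis` and `generated_topology` and the sample collections are our own choices, not notation from the text.

```python
def is_basis(X, B):
    """Check the two conditions of Proposition A.2.9 for a finite collection B."""
    X = frozenset(X)
    B = [frozenset(b) for b in B]
    # Condition 1: the elements of B cover X.
    if frozenset().union(*B) != X:
        return False
    # Condition 2: every point of B1 & B2 lies in some B3 contained in B1 & B2.
    for B1 in B:
        for B2 in B:
            common = B1 & B2
            for x in common:
                if not any(x in B3 and B3 <= common for B3 in B):
                    return False
    return True


def generated_topology(B):
    """Return all unions of subfamilies of B (the empty union is the empty set)."""
    opens = {frozenset()}
    for b in B:
        b = frozenset(b)
        opens |= {U | b for U in opens}
    return opens


X = {1, 2, 3, 4}
B = [{1}, {2, 3}, {3}, {1, 2, 3, 4}]

print(is_basis(X, B))                            # True
print(len(generated_topology(B)))                # 7
print(is_basis({1, 2, 3}, [{1, 2}, {2, 3}]))     # False
```

The last collection fails the second condition: the point 2 lies in $\{1,2\} \cap \{2,3\}$, but no member of the collection is contained in $\{2\}$.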


Example A.2.10. By Definition A.1.8, in a metric space, the topology associated 
to a metric has the set of open balls as a basis. 


Example A.2.11. Let $X = \mathbb{R}^2$, and consider the collection $\mathcal{B}$ of sets of the form 

$$U = \{(x,y) \in \mathbb{R}^2 \mid x > x_0 \text{ and } y > y_0\},$$

where $x_0$ and $y_0$ are constants. This collection $\mathcal{B}$ satisfies both of the criteria in 
Proposition A.2.9, hence there exists a unique topology $\tau$ on $\mathbb{R}^2$ with $\mathcal{B}$ as a basis. 
Note that for any open set $U \in \tau$, if $(a,b) \in U$, then the half-infinite ray 

$$\{(a+t, b+t) \mid t \geq 0\}$$

is a subset of $U$. This is not a property of the Euclidean topology on $\mathbb{R}^2$, so $\tau$ 
is different from the Euclidean topology. 


Proposition A.2.12. Let $(X,\tau)$ be a topological space. Then the following are true 
about the $\tau$-closed sets of $X$: 


1. $X$ and $\emptyset$ are closed. 
2. The union of any two closed sets is closed. 
3. The intersection of any collection of closed sets is closed. 


Proof. Part 1 is obviously true since both $X$ and $\emptyset$ are open. 
For part 2, let $F_1$ and $F_2$ be any two closed subsets of $X$. Then $F_1^c$ and $F_2^c$ are 
open sets. Thus $F_1^c \cap F_2^c$ is open. However, by the DeMorgan laws, 

$$F_1^c \cap F_2^c = (F_1 \cup F_2)^c.$$

Thus, since $(F_1 \cup F_2)^c$ is open, $F_1 \cup F_2$ is closed. 
For part 3, let $\{F_\alpha\}$, where $\alpha$ is in some indexing set $I$, be a collection of closed 
subsets. The collection $\{F_\alpha^c\}$ is a collection of open sets. Therefore, $\bigcup_{\alpha \in I} F_\alpha^c$ is 
open. Thus, 

$$\bigcap_{\alpha \in I} F_\alpha = \Big( \bigcup_{\alpha \in I} F_\alpha^c \Big)^c$$

is closed. 


A converse to this proposition turns out to be useful for defining certain classes 
of topologies on sets. 


Proposition A.2.13. Let $X$ be a set. Suppose that a collection $\mathcal{C}$ of subsets of $X$ 
satisfies the following properties: 


1. $X$ and $\emptyset$ are in $\mathcal{C}$. 
2. The union of any two sets in $\mathcal{C}$ is again in $\mathcal{C}$. 
3. The intersection of any collection of sets in $\mathcal{C}$ is again in $\mathcal{C}$. 


Then the set of all complements of sets in $\mathcal{C}$ forms a topology on $X$. 


Proof. (Left as an exercise for the reader.) 


Example A.2.14. Let $X$ be any set. Consider the collection $\mathcal{C}$ of subsets of $X$ 
consisting of $X$, $\emptyset$, and all finite subsets of $X$. The collection $\mathcal{C}$ satisfies all three 
criteria in Proposition A.2.13, so $X$, $\emptyset$, and the complements of finite subsets of $X$ 
form a topology on $X$. This is often called the finite complement topology. 


We now present a topology on $\mathbb{R}^n$ that has more open sets than the finite 
complement topology but fewer than the usual Euclidean topology. 


Example A.2.15. Let $X = \mathbb{R}^n$. Let $\mathcal{C}$ be the collection of all finite unions of 
affine subspaces of $\mathbb{R}^n$, where by affine subspace we mean any set of points that 
is the solution set to a set of linear equations in $x_1, x_2, \ldots, x_n$ (i.e., points, lines, 
planes, etc.). Taking the empty set of linear equations or an inconsistent set of 
linear equations, we obtain $X$ and $\emptyset$ as elements of $\mathcal{C}$. Since the union operation 
of sets is associative, a finite union of finite unions of affine spaces is again just a 
finite union of affine spaces. 

To establish that the third criterion in A.2.13 holds for $\mathcal{C}$, we must prove that 
the intersection of any collection of finite unions of affine spaces is a finite union 
of affine spaces. Note first that if the intersection of two affine subspaces $A_1$ and 
$A_2$ is a strict subspace of both $A_1$ and $A_2$, then $\dim(A_1 \cap A_2)$ is strictly less than 
$\dim A_1$ and $\dim A_2$. Let $\{F_\alpha\}_{\alpha \in I}$ be a collection of sets in $\mathcal{C}$, and let $\{\alpha_i\}_{i \in \mathbb{N}}$ be a 
sequence of indices. Given the sequence $\{\alpha_i\}$, create a sequence of “intersection” 
trees according to the following recursive definition. The tree $T_0$ is the tree with a 
base node, and an edge for each $A_{\alpha_0,j}$ in 

$$F_{\alpha_0} = \bigcup_j A_{\alpha_0,j},$$

with a corresponding leaf for each $A_{\alpha_0,j}$. For each tree $T_i$, construct $T_{i+1}$ from $T_i$ 
as follows. Writing 

$$F_{\alpha_{i+1}} = \bigcup_j A_{\alpha_{i+1},j},$$

for each leaf $F$ of $T_i$ and for each $j$ such that $F \cap A_{\alpha_{i+1},j} \neq F$, adjoin an edge labeled 
by $A_{\alpha_{i+1},j}$ to $F$ and label the resulting new leaf by $F \cap A_{\alpha_{i+1},j}$. 
As constructed, for each $k$, the leaves of the tree $T_k$ are labeled by intersections 
of affine spaces so that 

$$\bigcap_{i=0}^{k} F_{\alpha_i}$$

is the union of the leaves of $T_k$. Since only a finite number of edges gets added to 
every leaf, for all $i \geq 0$, there can be only a finite number of vertices at a fixed 
distance from the base node. Furthermore, since any nontrivial intersection $A \cap B$ 
of affine spaces has dimension 

$$\dim(A \cap B) \leq \min(\dim A, \dim B) - 1$$

and since the ambient space is $\mathbb{R}^n$, each branch (descending path) in any tree $T_i$ 
can have at most $n + 1$ edges. In conclusion, for all $T_i$, there can only be a finite 
number of leaves. Thus, 

$$\bigcap_{i=0}^{\infty} F_{\alpha_i}$$

is a finite union of intersections of affine spaces. Since this holds for all sequences 
$\{\alpha_i\}$ in $I$, this then proves that 

$$\bigcap_{\alpha \in I} F_\alpha$$

is a finite union of intersections of affine spaces. 

This rather lengthy proof shows that the collection $\mathcal{C}$ of finite unions of affine 
spaces in $\mathbb{R}^n$ satisfies the criteria of Proposition A.2.13. Consequently, the set $\tau$ 
of subsets of $\mathbb{R}^n$ that are complements of a finite union of affine spaces forms a 
topology on $\mathbb{R}^n$. 


Example A.2.15, along with the result of Problem A.2.3, shows that given a set 
$X$, it is possible to have two topologies $\tau$ and $\tau'$ on $X$ such that $\tau \subsetneq \tau'$, or in other 
words, that every $\tau$-open set in $X$ is $\tau'$-open but not vice versa. This leads to a 
useful notion. 


Definition A.2.16. Let $X$ be a set and let $\tau$ and $\tau'$ be two topologies on $X$. If 
$\tau \subset \tau'$, then we say that $\tau'$ is finer than $\tau$ and that $\tau$ is coarser than $\tau'$. If in 
addition $\tau \neq \tau'$, we say that $\tau'$ is strictly finer than $\tau$ and that $\tau$ is strictly coarser 
than $\tau'$. 


As a few examples, consider $X = \mathbb{R}^n$. Then the finite complement topology 
(Example A.2.14) is coarser than the topology defined in Example A.2.15, which is 
in turn coarser than the Euclidean topology. 

In Section A.1.2, Proposition A.1.18 proved that the closure of a subset $A$ (as 
defined by Definition A.1.17) of a metric space $(X,D)$ is the intersection of all 
closed subsets of $X$ containing $A$. This formulation of the closure of a set does not 
rely explicitly on the metric $D$ and carries over without changes to topological 
spaces. The closure of a subset $A \subset X$ is an example of a topological operator, three 
of which we mention here. 


Definition A.2.17. Let $(X,\tau)$ be a topological space and $A$ a subset of $X$. We 
define 


1. the closure of $A$, written $\operatorname{Cl} A$, as the intersection of all closed sets in $X$ 
containing $A$; 

2. the interior of $A$, written $A^\circ$, as the union of all open sets of $X$ contained in 
$A$; 

3. the frontier of $A$, written $\operatorname{Fr} A$, as the set of all $x \in X$ such that every 
neighborhood $U$ of $x$ intersects $A$ and $A^c$ nontrivially, i.e., $U \cap A \neq \emptyset$ and 
$U \cap A^c \neq \emptyset$. 
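
On a finite topological space, the three operators of Definition A.2.17 can be computed straight from their definitions. The following sketch uses the space of Example A.2.7. The function names `closure`, `interior`, and `frontier` are ours, and for the frontier we test only open sets containing $x$, which suffices because every neighborhood contains an open one.

```python
X = frozenset({"a", "b", "c"})
tau = {frozenset(), frozenset({"a"}), frozenset({"a", "b"}), X}


def closure(A, X, tau):
    """Intersection of all closed sets (complements of open sets) containing A."""
    A = frozenset(A)
    result = X
    for U in tau:
        F = X - U  # F is a closed set
        if A <= F:
            result &= F
    return result


def interior(A, tau):
    """Union of all open sets contained in A."""
    A = frozenset(A)
    result = frozenset()
    for U in tau:
        if U <= A:
            result |= U
    return result


def frontier(A, X, tau):
    """Points all of whose open neighborhoods meet both A and its complement."""
    A = frozenset(A)
    return frozenset(
        x for x in X
        if all(U & A and U & (X - A) for U in tau if x in U)
    )


print(sorted(closure({"b"}, X, tau)))    # ['b', 'c']
print(sorted(interior({"b"}, tau)))      # []
print(sorted(frontier({"b"}, X, tau)))   # ['b', 'c']
```

In this example the frontier of $\{b\}$ equals its closure with the interior removed, which is the general pattern one expects.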


We leave some of the basic properties of the above topological operators to the 
exercises but present one common characterization of closed sets. 


Proposition A.2.18. Let $(X,\tau)$ be a topological space, and let $A$ be any subset of 
$X$. The set $A$ is closed if and only if $A = \operatorname{Cl} A$. 


Proof. If $A$ is closed, then $A$ is among the closed sets $F \subset X$ that contain $A$, and 
there is no smaller closed subset containing $A$. Hence $A = \operatorname{Cl} A$. Conversely, if 
$A = \operatorname{Cl} A$, then since the intersection of any collection of closed sets is closed, $\operatorname{Cl} A$ 
is closed, so $A$ is closed. 


Another useful characterization of closed sets relies not on a topological operator 
but on properties of its limit points. 


Definition A.2.19. Let $A$ be any subset of a topological space $(X,\tau)$. A limit point 
of $A$ is any point $p \in X$ such that $U \cap (A - \{p\}) \neq \emptyset$ for every open neighborhood 
$U$ of $p$. 


In the vocabulary of sequences in a metric space, if the subset $A$ is a sequence 
$\{x_n\}$, then in Definition A.1.23, we would call the limit points of $A$ the accumulation 
points of $A$. For this reason, some authors use the alternate terminology 
of accumulation point for limit points of a subset $A$ in a topological space. The 
discrepancy in terminology is unfortunate, but in topology, the majority of authors 
use the vocabulary of Definition A.2.19. 


Proposition A.2.20. Let A be a subset of a topological space X. The set A is 
closed if and only if it contains all of its limit points. 


Proof. Assume first that $A$ is closed. The complement $X - A$ is open and hence is 
a neighborhood of each of its points. Therefore, there is no point in $X - A$ that is 
a limit point of $A$. Hence, $A$ contains all its limit points. 

Assume now that $A$ contains all of its limit points. Let $p \in X - A$. Since $p$ is not 
a limit point of $A$, there exists an open neighborhood $U$ of $p$ such that $U \cap A = \emptyset$. 
Thus, $X - A$ is a neighborhood of each of its points, and hence, it is open. Thus, 
$A$ is closed. 


Finally, we mention one last term related to closures. 


Definition A.2.21. Let $(X,\tau)$ be a topological space. A subset $A$ of $X$ is called 
dense in $X$ if $\operatorname{Cl} A = X$. 


By Proposition A.2.20 a set A is dense in X if every point of X is a limit point 
of A. We give a few common examples of dense subsets. 


Example A.2.22. Let $I = [a,b]$ be a closed interval in $\mathbb{R}$, and equip $I$ with the 
topology induced from the Euclidean metric on $\mathbb{R}$. The open subsets in this topology 
on $I$ are of the form $U \cap I$, where $U$ is an open subset of $\mathbb{R}$. Furthermore, the open 
interval $(a,b)$ is dense in $I$. 


Proposition A.2.23. The set $\mathbb{Q}$ of rational numbers is dense in $\mathbb{R}$. 


Proof. A precise proof of this statement must rely on a definition of real numbers, 
as constructed from the rationals. One may find this definition in any introductory 
analysis text. However, using a high school understanding of real numbers as numbers 
with a possibly infinite, not necessarily periodic, decimal expansion, we can supply 
a simple proof of this fact. 

Let $x_0 \in \mathbb{R}$ be any real number, and let $U$ be an open neighborhood of $x_0$. By 
definition, there exists a positive real $\varepsilon > 0$ such that $(x_0 - \varepsilon, x_0 + \varepsilon) \subset U$. Let 
$N = 1 - \lfloor \log_{10} \varepsilon \rfloor$. Consider $q$ the fraction that represents the decimal approximation 
of $x_0$ that stops at $N$ digits after the decimal point. Then $|x_0 - q| < 10^{-N} < \varepsilon$, so 
$q \in \mathbb{Q}$ and $q \in U$. Hence, $x_0$ is a limit point of $\mathbb{Q}$. 
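
The truncation argument in the proof is easy to try out numerically. The sketch below is only an illustration under stated assumptions: the function name `rational_within` is ours, a floating-point number stands in for the real number $x_0$, and we clamp $N$ at 0 so that a large $\varepsilon$ does not produce a negative number of digits.

```python
import math
from fractions import Fraction


def rational_within(x0, eps):
    """Return a rational q with |x0 - q| < eps by truncating the decimal expansion of x0."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    # Number of digits kept after the decimal point, as in the proof.
    N = max(1 - math.floor(math.log10(eps)), 0)
    scale = 10 ** N
    # Truncate x0 after N decimal digits, using exact rational arithmetic.
    return Fraction(math.floor(Fraction(x0) * scale), scale)


x0 = math.sqrt(2)
for eps in (0.5, 1e-3, 1e-9):
    q = rational_within(x0, eps)
    print(eps, q, abs(Fraction(x0) - q) < Fraction(eps))
```

Each line printed ends in `True`, confirming that the truncated fraction lies in the interval $(x_0 - \varepsilon, x_0 + \varepsilon)$.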


Knowing that Q is a countable and dense subset of R, we can introduce a 
property of the Euclidean topology that is a key part of the definition of a topological 
manifold. 


Definition A.2.24. A topological space $(X,\tau)$ is called second countable if there 
exists a basis of $\tau$ that is countable. 


Example A.2.25. The Euclidean space $\mathbb{R}^n$ is second countable. Consider the 
collection $\mathcal{B}$ of open balls whose centers have rational coordinates and of rational 
radius. This collection is in bijection with $\mathbb{Q}^n \times \mathbb{Q}^{>0}$. Since the Cartesian product of 
countable sets is countable, $\mathcal{B}$ is countable. It is not hard to see that this collection 
satisfies the conditions of Proposition A.2.9, so this $\mathcal{B}$ is a countable basis of $\mathbb{R}^n$. 


A.2.2 Continuity 


When working with topological spaces $(X,\tau)$ and $(Y,\tau')$, we are usually not interested 
in studying the properties of just any function $f : X \to Y$ because 
a function without any special properties will not necessarily relate the topology 
on $X$ to that on $Y$ or vice versa. In Section A.1.4, Proposition A.1.29 provided a 
characterization of continuous functions between metric spaces only in terms of the 
open sets in the metric space topology. This motivates the definition of continuity 
for functions between topological spaces. 


Definition A.2.26. Let $(X,\tau)$ and $(Y,\tau')$ be two topological spaces, and let 
$f : X \to Y$ be a function. We call $f$ continuous (with respect to $\tau$ and $\tau'$) if for every 
open set $U \subset Y$, the set $f^{-1}(U)$ is open in $X$. 
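
For finite spaces, Definition A.2.26 is a finite check: compute each preimage and see whether it is open. The sketch below is our own illustration. A function is represented as a Python dictionary, the names `preimage` and `is_continuous` are assumptions, the domain is the space of Example A.2.7, and the codomain is a two-point space of our choosing.

```python
def preimage(f, V):
    """Preimage of the set V under the function f (given as a dict)."""
    return frozenset(x for x in f if f[x] in V)


def is_continuous(f, tau_X, tau_Y):
    """Check that the preimage of every open set of Y is open in X."""
    tau_X = {frozenset(U) for U in tau_X}
    return all(preimage(f, V) in tau_X for V in tau_Y)


# Domain: the space of Example A.2.7.
tau_X = [set(), {"a"}, {"a", "b"}, {"a", "b", "c"}]
# Codomain: the two-point set {0, 1} in which only {1} is a proper open set.
tau_Y = [set(), {1}, {0, 1}]

f = {"a": 1, "b": 0, "c": 0}   # preimage of {1} is {a}, which is open
h = {"a": 0, "b": 1, "c": 0}   # preimage of {1} is {b}, which is not open

print(is_continuous(f, tau_X, tau_Y))   # True
print(is_continuous(h, tau_X, tau_Y))   # False
```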


Example A.2.27. Proposition A.1.29 shows that any function called continuous 
between two metric spaces (X,D) and (Y, D’) is continuous with respect to the 
topologies induced by the metrics on X and Y. 


In previous contexts (e.g., functions from $\mathbb{R}^n$ to $\mathbb{R}^m$, and functions between 
metric spaces), we first defined the notion of continuity at a point and then expanded 
to continuous over a domain. In this case, as in all texts on topology, we first defined 
continuity over a whole domain. However, there is a natural definition for continuity 
at a point. 


Definition A.2.28. Let $f : X \to Y$ be as in the previous definition. We call $f$ 
continuous at $x \in X$, if for every neighborhood $V$ of $f(x)$, there is a neighborhood 
$U$ of $x$ with $f(U) \subset V$. 


The following proposition shows why this definition of continuity at a point is 
natural. 


Proposition A.2.29. Let $(X,\tau)$ and $(Y,\tau')$ be topological spaces, and let $f : X \to Y$ 
be a function. Then $f$ is continuous if and only if $f$ is continuous at all $x \in X$. 


Proof. First, suppose that $f$ is continuous. Then for all $f(x) \in Y$ and for all open 
neighborhoods $V$ of $f(x)$, $f^{-1}(V)$ is open in $X$. Furthermore, $x \in f^{-1}(V)$ so 
$f^{-1}(V)$ is an open neighborhood of $x$. Also, since $f(f^{-1}(V)) \subset V$ for any set $V$, we 
see that setting $U = f^{-1}(V)$ proves one direction. 

Second, assume that $f$ is continuous at all $x \in X$. Let $V$ be any open set in $Y$. If 
$V$ contains no image $f(x)$, then $f^{-1}(V) = \emptyset$, which is open in $X$. Therefore, assume 
that $V$ contains some image $f(x)$. Let $W = f^{-1}(V)$. According to the assumption, 
for all $x \in W$ there exists an open neighborhood $U_x$ of $x$ such that $f(U_x) \subset V$. Since 
$f(U_x) \subset V$, then $U_x \subset f^{-1}(V) = W$. But then 

$$W \subset \bigcup_{x \in W} U_x \subset W^\circ,$$

and since $W^\circ \subset W$ always, we conclude that $W = W^\circ$, which implies (see Problem 
A.2.7) that $W$ is open. This shows that $f$ is continuous. 


Proposition A.2.30. Let $(X,\tau)$, $(Y,\tau')$, and $(Z,\tau'')$ be three topological spaces. If 
$f : X \to Y$ and $g : Y \to Z$ are continuous functions, then $g \circ f : X \to Z$ is also a 
continuous function. 


Proof. (Left as an exercise for the reader.) 


For metric spaces, a continuous function is one that preserves “nearness” of 
points. Though topological spaces are not necessarily metric spaces, a continuous 
function $f : X \to Y$ between topological spaces preserves nearness in the sense that 
if two images $f(x_1)$ and $f(x_2)$ are in the same open set $V$, then there is an open 
set $U$ that contains both $x_1$ and $x_2$, with $f(U) \subset V$. 

In set theory, we view two sets as “the same” if there exists a bijection between 
them: they are identical except for how we label the specific elements. For topological 
spaces to be considered “the same,” not only do the underlying sets need to 
be in bijection, but this bijection must preserve the topology. This is the concept 
of a homeomorphism. 


Definition A.2.31. Let $(X,\tau)$ and $(Y,\tau')$ be two topological spaces, and let 
$f : X \to Y$ be a function. The function $f$ is called a homeomorphism if 

1. $f$ is a bijection; 

2. $f : X \to Y$ is continuous; 

3. $f^{-1} : Y \to X$ is continuous. 


If there exists a homeomorphism between two topological spaces, we call them 
homeomorphic. 


Example A.2.32. Any two squares $S_1$ and $S_2$, as subsets of $\mathbb{R}^2$ with the Euclidean 
metric, are homeomorphic. For $i = 1,2$, let $t_i$ be the translation that brings the 
center of $S_i$ to the origin, $R_i$ a rotation that makes the edges of $t_i(S_i)$ parallel to the 
$x$- and $y$-axes, and let $h_i$ be a scaling (homothety) that changes $R_i \circ t_i(S_i)$ into the 
square with vertices $\{(-1,-1), (-1,1), (1,1), (1,-1)\}$. It is easy to see that translations 
are homeomorphisms of $\mathbb{R}^2$ to itself. Furthermore, by Problem A.1.28, we see 
that $R_i$ and $h_i$ are continuous, and since they are invertible linear transformations, 
their inverses are continuous as well. Thus, $R_i$ and $h_i$ are homeomorphisms. Thus, 
the function 

$$f = t_2^{-1} \circ R_2^{-1} \circ h_2^{-1} \circ h_1 \circ R_1 \circ t_1$$

is a homeomorphism of $\mathbb{R}^2$ onto itself that sends $S_1$ to $S_2$. Thus, $S_1$ and $S_2$ are 
homeomorphic. 


Figure A.6: A homeomorphism between a circle and a square from Example A.2.33. 


Example A.2.33. Any circle and any square, as subsets of $\mathbb{R}^2$, are homeomorphic. 
We must present a homeomorphism between a circle and a square. By Example 
A.2.32 we see that any two squares are homeomorphic. By a similar argument, any 
two circles are homeomorphic. So without loss of generality, consider 
the unit circle $S$ and the unit square $T$, both with center $(0,0)$. (See Figure A.6.) 

Let $f : T \to S$ be the function defined by $f(x) = x/\|x\|$, where $x$ is viewed 
as a point in $\mathbb{R}^2$. It is clear that this is a continuous function on its domain, 
$\mathbb{R}^2 - \{(0,0)\}$, so it is continuous on the subset $T$. For $\theta \in \mathbb{R}$, consider the function 
$g(\theta) = \max(|\sin\theta|, |\cos\theta|)$ and then the curve $\gamma : [0,2\pi] \to \mathbb{R}^2$ parametrized by 
$\gamma(t) = (\cos(t)/g(t), \sin(t)/g(t))$. It is not hard to verify that this curve traces out 
the unit square $T$. Identifying a point on the circle with its angle, we can think of 
$\gamma$ as a function $S \to T$. It is not hard to see that $g$ is continuous and that $\gamma$ is as 
well. Furthermore, $(f \circ \gamma)(p) = p$ for all $p \in S$ and also $(\gamma \circ f)(q) = q$ for all $q \in T$. 
Hence, $f$ is a homeomorphism between the unit square and the unit circle. 

Figure A.6 shows corresponding points $P$ and $Q$ as well as corresponding 
neighborhoods of these points. 
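
The two inverse identities in Example A.2.33 can be spot-checked numerically. The following sketch is only a sanity check at sample points, not a proof. It assumes that the "unit square" is the square with vertices $(\pm 1, \pm 1)$, that a point of the circle is identified with its angle through `math.atan2`, and that a tolerance of $10^{-9}$ is adequate; the helper names `angle` and `close` are ours.

```python
import math


def f(q):
    """Radial projection x -> x/||x|| from the square onto the unit circle."""
    x, y = q
    r = math.hypot(x, y)
    return (x / r, y / r)


def g(theta):
    return max(abs(math.sin(theta)), abs(math.cos(theta)))


def gamma(t):
    """Point of the square lying at angle t from the positive x-axis."""
    return (math.cos(t) / g(t), math.sin(t) / g(t))


def angle(p):
    """Angle in [0, 2*pi) of a point p on the unit circle."""
    return math.atan2(p[1], p[0]) % (2 * math.pi)


def close(p, q, tol=1e-9):
    return math.hypot(p[0] - q[0], p[1] - q[1]) < tol


samples = [2 * math.pi * k / 360 for k in range(360)]
for t in samples:
    q = gamma(t)                       # a point on the square
    p = (math.cos(t), math.sin(t))     # the corresponding point on the circle
    assert abs(max(abs(q[0]), abs(q[1])) - 1) < 1e-9   # q lies on the square
    assert close(f(q), p)                              # (f o gamma)(p) = p
    assert close(gamma(angle(f(q))), q)                # (gamma o f)(q) = q
print("all sample points pass")
```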


Example A.2.34. The above two examples only begin to illustrate how different 
homeomorphic spaces may look. In this example, we prove that any closed, simple, 
parametrized curve $\gamma$ in $\mathbb{R}^n$ is homeomorphic to the unit circle $S^1$ (in $\mathbb{R}^2$). Let 
$\vec{x} : [0,\ell] \to \mathbb{R}^n$ be a parametrization by arclength for the curve $\gamma$ such that 
$\vec{x}(t_1) = \vec{x}(t_2)$ implies that $t_1 = 0$ and $t_2 = \ell$, assuming $t_1 < t_2$. The function $f : \gamma \to S^1$ 
defined by 

$$f(P) = \left( \cos\left( \frac{2\pi}{\ell}\, \vec{x}^{-1}(P) \right), \sin\left( \frac{2\pi}{\ell}\, \vec{x}^{-1}(P) \right) \right) \qquad \text{(A.2)}$$

produces the appropriate homeomorphism. Note that $\vec{x}^{-1}$ is not well defined only at 
the point $\vec{x}(0)$ because $\vec{x}^{-1}(\vec{x}(0)) = \{0,\ell\}$. However, using either 0 or $\ell$ in Equation 
(A.2) is irrelevant. 


Example A.2.35. In contrast to the previous example, consider the closed, regular 
curve $\gamma$ parametrized by $\vec{x} : [0,2\pi] \to \mathbb{R}^2$ with 

$$\vec{x}(t) = (\cos t, \sin 2t).$$

The curve $\gamma$ traces out a figure eight of sorts and is not simple because 
$\vec{x}(\pi/2) = (0,0) = \vec{x}(3\pi/2)$ (and $\vec{x}'(\pi/2) \neq \vec{x}'(3\pi/2)$). We show that it is not homeomorphic 
to a circle $S^1$. Call $P = (0,0)$. 



Figure A.7: Figure eight not homeomorphic to a circle. 


There does exist a surjective continuous map $f$ of $\gamma$ onto the circle $S^1$, namely, 
using the parametrization $\vec{x}(t)$, 

$$f(\vec{x}(t)) = (\cos 2t, \sin 2t),$$

which amounts to folding the figure eight back onto itself so that the circle is covered 
twice. However, there exists no continuous bijection $g : \gamma \to S^1$. To see this, call 
$Q = g(P)$, and let $U$ be a small open neighborhood of $Q$. If $g$ is a bijection, every 
element $x \in U$ has exactly one preimage. Thus, $g^{-1}(U)$ is the image 
of a nonintersecting parametrized curve. However, every open neighborhood of $P$ 
includes two segments of curves. Thus, the circle and $\gamma$ are not homeomorphic. 


Figure A.8: Figure eight not homeomorphic to a segment. 


Example A.2.36. Consider the same regular, closed parametrized curve as in the 
previous example, here denoted $\gamma_2$. The function $f : (0,1) \to \gamma_2$ defined by 
$f(t) = \vec{x}(2\pi t - \frac{\pi}{2})$ is a bijection. Furthermore, $f$ is continuous since it is 
continuous as a function $(0,1) \to \mathbb{R}^2$. However, the inverse function is not continuous and in fact no continuous 
bijection $g : (0,1) \to \gamma_2$ can have a continuous inverse. 

Let $a \in (0,1)$ be the real number such that $f(a) = P$, the point of self-intersection 
on $\gamma_2$. Take an open segment $U$ around $a$. The image $f(U)$ is a portion 
of $\gamma_2$ through $P$ (see the heavy lines in Figure A.8). However, this portion $f(U)$ 
is not an open subset of $\gamma_2$ since every open neighborhood of $P$ includes $f((0,\varepsilon_1))$ 
and $f((1-\varepsilon_2,1))$ for some $\varepsilon_1, \varepsilon_2 > 0$. Thus there exists no homeomorphism between the open interval 
and the figure eight. 


We conclude this section on continuity by mentioning one particular result, the 
proof of which exceeds the scope of this appendix. 


Theorem A.2.37. The Euclidean spaces $\mathbb{R}^n$ and $\mathbb{R}^m$ are homeomorphic if and 
only if $m = n$. 


This theorem states that Euclidean spaces can only be homeomorphic if they 
are of the same dimension. This might seem obvious to the casual reader but this 
fact hides a number of subtleties. First of all, the notion of dimension of a set 
in topology is not at all a simple one. Secondly, we must be careful to consider 
space-filling curves, such as the Peano curve, which is a continuous surjection of 
the closed interval $[0,1]$ onto the closed unit square $[0,1] \times [0,1]$. (See [28], Section 
3-3, for a construction.) The construction for space-filling curves can be generalized 
to find continuous surjections of $\mathbb{R}^n$ onto $\mathbb{R}^m$ even if $n < m$. However, Theorem 
A.2.37 implies that no space-filling curve is bijective and has a continuous inverse. 
That Euclidean spaces of different dimensions are not homeomorphic means that 
they are distinct from the perspective of topology. 


A.2.3 Derived Topological Spaces 


Given any topological space $(X,\tau)$, there are a number of ways to create a new 
topological space. We present two common ways — subset topology and quotient 
spaces — which are used throughout this book. 


Definition A.2.38. Let $(X,\tau)$ be a topological space, and let $S$ be any subset of 
$X$. We define the subset topology $\tau'$ on $S$ by calling a subset $A \subset S$ open if and 
only if there exists an open subset $U$ of $X$ such that $A = S \cap U$. 
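
On a finite space the subset topology can be listed explicitly by intersecting every open set with $S$. The sketch below is an illustration only: the function name `subspace_topology` is ours, and the ambient space is again the one from Example A.2.7.

```python
def subspace_topology(S, tau):
    """Return the subset topology on S: all intersections of S with open sets of X."""
    S = frozenset(S)
    return {S & frozenset(U) for U in tau}


tau = [set(), {"a"}, {"a", "b"}, {"a", "b", "c"}]
S = {"b", "c"}

tau_S = subspace_topology(S, tau)
print(sorted(sorted(A) for A in tau_S))   # [[], ['b'], ['b', 'c']]
```

Note that $\{b\}$ is open in $S$ even though it is not open in $X$: being open in the subset topology is a weaker requirement than being open in the ambient space.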


The subset topology is sometimes called the topology induced on S from X. 
There is an alternate way to characterize it. 


Proposition A.2.39. Let $(X,\tau)$ be a topological space, and let $S \subset X$. Let $i : S \to X$ 
be the inclusion function. The subset topology on $S$ is the coarsest topology such 
that $i$ is a continuous function. 


Proof. Given any subset $A$ of $X$, we have $i^{-1}(A) = A \cap S$. If $i$ is continuous, then 
for all open subsets $U \subset X$, the set $i^{-1}(U) = U \cap S$ is open. However, according to 
Definition A.2.38, the subset topology on $S$ has no other open subsets and therefore 
is coarser than any other topology on $S$ making $i$ continuous. 


Example A.2.40. Consider $\mathbb{R}$ equipped with the usual topology. Let $S = [a,b]$ 
be a closed interval. If $a < c < b$, in the subset topology on $S$, the interval $[a,c)$ is 
open. To see this, take any real $d < a$. Then 

$$[a,c) = (d,c) \cap [a,b],$$

and $(d,c)$ is open in $\mathbb{R}$. 


A second and often rather useful way to create new topological spaces is to 
induce a topology on a quotient set. 


Definition A.2.41. Let X be a set, and let R be an equivalence relation on X. 
The set of equivalence classes of R is denoted by X/R and is called the quotient set 
of X by R. 


The concept of a quotient set arises in many areas of mathematics (congruence 
classes in number theory, quotient groups in group theory, quotient rings in ring 
theory, etc.) but also serves as a convenient way to define interesting objects in 
topology and geometry. We provide a few such examples before discussing topologies 
on quotient sets. 


Example A.2.42 (Real Projective Space). The typical construction of the real 
projective space is given in Example 3.1.6. We present two other equivalent 
constructions. 

Let $X$ be the set of all lines in $\mathbb{R}^{n+1}$, and let $R$ be the equivalence relation of 
parallelism on $X$. We can therefore discuss the set of equivalence classes $X/R$. Each 
line in $X$ is uniquely parallel to a line that passes through the origin, and hence, 
$X/R$ may be equated with the set of lines passing through the origin. This set is 
called the real projective space of dimension $n$ and is usually denoted by $\mathbb{RP}^n$. 

As an alternate characterization for $\mathbb{RP}^n$, consider the unit $n$-sphere $S^n$ in $\mathbb{R}^{n+1}$ 
centered at the origin. Each line through the origin intersects $S^n$ at two antipodal 
points. Therefore, $\mathbb{RP}^n$ is the quotient set of $S^n$ with respect to the equivalence 
relation in which two points are called equivalent if they are antipodal (form the 
ends of a diameter through $S^n$). 


Example A.2.43 (Grassmannian). Let $X_r$ be the set of all $r$-dimensional affine 
subspaces ($r$-dimensional hyperplanes) in $\mathbb{R}^n$, and let $R$ be the equivalence relation of parallelism between 
hyperplanes. The set of equivalence classes is called a Grassmannian and is 
denoted $G(r,n)$. Again, since each $r$-dimensional hyperplane is uniquely parallel to 
one hyperplane through the origin, $G(r,n)$ is the set of $r$-hyperplanes through the 
origin. 

Of particular interest to geometry is the question of how to give a topology to 
X/R if X is equipped with a topology. Proposition A.2.39 illustrates how to make 
a reasonable definition. 


Definition A.2.44. Let $(X,\tau)$ be a topological space and let $R$ be an equivalence 
relation on $X$. Define $f : X \to X/R$ as the function that sends an element in $X$ to its 
equivalence class; $f$ is called the quotient map (or sometimes identification map). 
We call quotient topology (or identification topology) on $X/R$ the finest topology 
that makes $f$ continuous. 


The above definition for the quotient topology does not make it too clear what 
the open sets of X/R should be. The following proposition provides a different 
characterization. 


Proposition A.2.45. Let $(X,\tau)$ be a topological space, let $R$ be an equivalence 
relation on $X$ and let $f$ be the quotient map. The quotient topology on $X/R$ is 

$$\tau' = \{U \in \mathcal{P}(X/R) \mid f^{-1}(U) \in \tau\}.$$


Proof. Let $\tau'$ be a topology on $X/R$ such that $f : X \to X/R$ is continuous. Then 
for all open sets $U \in \tau'$, we must have $f^{-1}(U)$ be open in $X$. Note first that 
$f^{-1}(\emptyset) = \emptyset$ and $f^{-1}(X/R) = X$, which are both open in $X$. 

From basic set theory, for any function $F : A \to B$ and any collection $\mathcal{C}$ of 
subsets of $B$, we have 

$$F^{-1}\Big( \bigcup_{S \in \mathcal{C}} S \Big) = \bigcup_{S \in \mathcal{C}} F^{-1}(S) \quad \text{and} \quad F^{-1}\Big( \bigcap_{S \in \mathcal{C}} S \Big) = \bigcap_{S \in \mathcal{C}} F^{-1}(S).$$

(See Section 1.3 in [48].) Therefore, if $U_1$ and $U_2$ are such that $f^{-1}(U_1) \cap f^{-1}(U_2)$ is 
open, then $f^{-1}(U_1 \cap U_2)$ is open. Also, for any collection of sets $\{U_\alpha\}_{\alpha \in I}$ in $X/R$, 

$$f^{-1}\Big( \bigcup_{\alpha \in I} U_\alpha \Big) = \bigcup_{\alpha \in I} f^{-1}(U_\alpha),$$

so if the right-hand side is open, then so is the left-hand side. Consequently, the 
collection of subsets $\mathcal{B}$ of $X/R$ such that $U \in \mathcal{B}$ if and only if $f^{-1}(U) \in \tau$ is a 
topology on $X/R$. However, any finer topology on $X/R$ would include some subset 
$S \subset X/R$ such that $f^{-1}(S)$ would not be open in $X$ and, hence, would make $f$ not 
continuous. 


Figure A.9: Open set around {0, 1}. 
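
Proposition A.2.45 turns the quotient topology into something computable on a finite space: a set of equivalence classes is open exactly when the union of those classes is open in $X$. The sketch below is our own illustration. The names `powerset` and `quotient_topology` are assumptions, each point of $X/R$ is represented by the frozenset of its members, and the ambient space is the one from Example A.2.7.

```python
from itertools import combinations


def powerset(items):
    """All subsets of a finite collection, each returned as a frozenset."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def quotient_topology(tau, classes):
    """Open sets of X/R: families of classes whose preimage (their union) is open in X."""
    tau = {frozenset(U) for U in tau}
    classes = [frozenset(c) for c in classes]
    topology = set()
    for family in powerset(classes):
        # The preimage of a set of classes under the quotient map is their union.
        pre = frozenset().union(*family)
        if pre in tau:
            topology.add(family)
    return topology


tau = [set(), {"a"}, {"a", "b"}, {"a", "b", "c"}]

# Identify b with c: the classes are {a} and {b, c}.
print(len(quotient_topology(tau, [{"a"}, {"b", "c"}])))   # 3
# Identify a with c: the classes are {a, c} and {b}.
print(len(quotient_topology(tau, [{"a", "c"}, {"b"}])))   # 2
```

In the second case only the empty set and the whole quotient are open, so identifying $a$ with $c$ produces the trivial topology on the two-point quotient.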


Example A.2.46 (Circles). Let $I$ be the interval $[0,1]$ equipped with the topology 
induced from $\mathbb{R}$. Consider the equivalence relation $R$ that identifies 0 with 1 and 
leaves all other points inequivalent. The identification space $I/R$ is homeomorphic to a 
circle. We may use the function $f : I/R \to S^1$ defined by 

$$f(t) = (\cos 2\pi t, \sin 2\pi t),$$

which is well-defined since $f(0) = f(1)$, so whether we take 0 or 1 for the equivalence 
class $\{0,1\}$, we obtain the same image. This function $f$ is clearly bijective. 

To prove that $f$ is continuous, let $P \in S^1$, and let $x \in I/R$ be the point with $f(x) = P$. If $P \neq (1,0)$, then 
any open neighborhood $U$ of $P$ contains an open arc of angles $\theta_1 < \theta < \theta_2$, 
where $0 < \theta_1 < \theta_2 < 2\pi$ and the angle $\theta_0$ corresponding to $P$ satisfies $\theta_1 < \theta_0 < \theta_2$. 
Then 

$$f\left( \left( \frac{\theta_1}{2\pi}, \frac{\theta_2}{2\pi} \right) \right)$$

contains $P$ and is a subset of $U$. On the other hand, if $P = (1,0)$, any open 
neighborhood $U'$ of $P$ in $S^1$ contains an open arc of angles $-\theta_1 < \theta < \theta_2$, with 
$\theta_1, \theta_2 > 0$. Then if $g : I \to I/R$ is the quotient map, 

$$f \circ g\left( \left[ 0, \frac{\theta_2}{2\pi} \right) \cup \left( 1 - \frac{\theta_1}{2\pi}, 1 \right] \right)$$

contains $P$ and is contained in $U'$. Furthermore, by definition, 
$g([0, \theta_2/2\pi) \cup (1 - \theta_1/2\pi, 1])$ is open in $I/R$ since $[0, \theta_2/2\pi) \cup (1 - \theta_1/2\pi, 1]$ is open in the subset 
topology of $[0,1]$. By Proposition A.2.29, $f$ is continuous. 

Using a similar argument, it is not hard to show that $f^{-1}$ is continuous, 
concluding that $I/R$ is homeomorphic to $S^1$. 



Figure A.10: Möbius strip. 


Example A.2.47. Consider the real projective space $\mathbb{RP}^n$ given as the quotient 
space $S^n/\sim$, where $\sim$ is the equivalence relation on $S^n$ such that $p_1 \sim p_2$ if and only 
if they are antipodal to each other, i.e., form a diameter of the sphere. The unit 
sphere $S^n$ naturally inherits the subspace topology from $\mathbb{R}^{n+1}$. Definition A.2.44 
provides the induced topology for $\mathbb{RP}^n$. 


Example A.2.48 (Möbius Strip). Let $I = [0,1] \times [0,1]$ be the unit square with 
the topology induced from $\mathbb{R}^2$. Define the identification (equivalence relation $\sim$) 
between points by $(0,y) \sim (1,1-y)$, for all $y \in [0,1]$, where no other points are 
equivalent to any others. The topological space obtained is called the Möbius strip. 
In $\mathbb{R}^3$, the Möbius strip can be viewed as a strip of paper twisted once and with 
ends glued together. Figure A.10 shows an embedding of the Möbius strip in $\mathbb{R}^3$, 
as well as a diagrammatic representation of the Möbius strip. In the diagrammatic 
representation, the arrows indicate that the opposite edges are identified but in 
inverse direction. The shaded area shows a disk around a point $p$ on the identified 
edge. 


A.2.4 Compactness 


In any calculus course, we encounter the Extreme Value Theorem, a result in analysis 
that forms an essential ingredient to Rolle’s Theorem, hence the Mean Value 
Theorem, and therefore the Fundamental Theorem of Calculus. 


Theorem A.2.49 (Extreme Value Theorem). Let $f : [a,b] \to \mathbb{R}$ be a continuous 
real-valued function. Then $f$ attains a maximum and a minimum over the interval 
$[a,b]$. 


As topology developed, a variety of attempts were made to generalize the idea 
contained in this fact. In the context of metric spaces, the key properties that 
would allow for a generalization are that the interval [a,b] is closed and bounded. 
Though closed is a concept in topology, the notion of being bounded does not 
have an equivalent notion. It turns out that in topological spaces, the notion of 
compactness provides this desired generalization. 


Definition A.2.50. Let $(X,\tau)$ be a topological space. Let $\mathcal{U} = \{U_i\}_{i \in I}$ be a 
collection of open sets in $X$. We call $\mathcal{U}$ an open cover of $X$ if 

$$X = \bigcup_{i \in I} U_i.$$

If $J \subset I$, the collection $\mathcal{V} = \{U_j\}_{j \in J}$ is called a subcover of $\mathcal{U}$ if $\mathcal{V}$ is itself an open 
cover of $X$. 


Definition A.2.51. A topological space $(X,\tau)$ is called compact if every open cover 
of $X$ has a finite subcover. 


Remark A.2.52. A subset $A$ of $X$ is called compact if it is compact when equipped 
with the subspace topology induced from $(X,\tau)$. This is equivalent to the property 
that whenever there exists a collection $\mathcal{U} = \{U_i\}_{i \in I}$ of open sets with $A \subset \bigcup_{i \in I} U_i$, 
there exists a finite subset $J \subset I$ with $A \subset \bigcup_{j \in J} U_j$. 


Some examples of compact spaces are obvious, such as that any finite subset of 
a topological space is compact. However, the following theorem justifies, at least in 
part, the given definition of compactness. 


Theorem A.2.53 (Heine-Borel). A closed and bounded interval $[a,b]$ of $\mathbb{R}$ is 
compact. 


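Proof. (A variety of proofs exist for the Heine-Borel Theorem. The following proof
is called the “creeping along” proof.)
Let $\mathcal{U} = \{U_\alpha\}_{\alpha \in I}$ be an open cover of $[a,b]$. Define the subset $E$ of $[a,b]$ by

$$E = \{x \in [a,b] \mid [a,x] \text{ is contained in the union of a finite subfamily of } \mathcal{U}\}.$$

Obviously, $E$ is an interval, but a priori we do not know whether it is open or closed
or even nonempty. We will show that $b \in E$ to establish the theorem.

Now $a \in E$ because there exists some $U_{\alpha_0}$ such that $a \in U_{\alpha_0}$. Let $c = \operatorname{lub} E$.
Clearly $a \le c \le b$. Suppose that $c < b$. Since $\mathcal{U}$ covers $[a,b]$, there exists some index
$\beta$ such that $c \in U_\beta$, and therefore there exists $\varepsilon > 0$ such that $(c-\varepsilon, c+\varepsilon) \subset U_\beta$. Since
$c$ is the least upper bound of $E$, there is some $x \in (c-\varepsilon, c]$ with $x \in E$. Therefore, there exists a
finite set $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ such that

$$[a,x] \subset \bigcup_{i=1}^{n} U_{\alpha_i}.$$

But then the finite union of open sets

$$(c-\varepsilon, c+\varepsilon) \cup \bigcup_{i=1}^{n} U_{\alpha_i}$$

contains the interval $[a, c+\varepsilon/2]$. Shrinking $\varepsilon$ if necessary so that $c + \varepsilon/2 \le b$, we get
$c + \varepsilon/2 \in E$, contradicting the assumption that $c < b$ and $c = \operatorname{lub} E$.
Thus $c = b$, and the same argument applied at $c = b$ shows that $[a,b]$ itself is contained in a finite union of members of $\mathcal{U}$, so $b \in E$.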
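
The "creeping along" idea can be imitated computationally when the cover is already a finite list of open intervals. The following Python sketch is only an illustration and is not part of the proof: the function name `finite_subcover` and the sample intervals are hypothetical. Starting at $a$, it repeatedly picks an interval containing the current point and reaching as far right as possible, much as the set $E$ is pushed to the right in the argument above.

```python
def finite_subcover(a, b, intervals):
    """Greedily pick open intervals (lo, hi) whose union contains [a, b].

    Returns the chosen intervals in left-to-right order, or None when the
    given family fails to cover some point of [a, b].
    """
    chosen = []
    point = a  # leftmost point of [a, b] not yet known to be covered
    while True:
        # Open intervals that contain the current point.
        candidates = [(lo, hi) for (lo, hi) in intervals if lo < point < hi]
        if not candidates:
            return None  # the family does not cover `point`
        best = max(candidates, key=lambda iv: iv[1])
        chosen.append(best)
        if best[1] > b:
            return chosen  # the chain now reaches past b
        # An open interval misses its own right endpoint, so creep to it.
        point = best[1]


cover = [(-0.1, 0.3), (0.2, 0.6), (0.25, 0.5), (0.55, 1.2), (0.0, 0.1)]
subcover = finite_subcover(0.0, 1.0, cover)
```

Each pass strictly increases the right endpoint reached, so the loop stops after at most as many steps as there are intervals.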
Compactness is an important property of topological spaces, and we refer the
reader to Chapter 3 in [2], Section 6.6 in [15], or Chapter 7 in [27] for complete 
treatments. For the purposes of this book however, we are primarily interested 
in two results, namely Theorem A.2.57 and Theorem A.2.58, and a few necessary 
propositions to establish them. We will not prove Theorem A.2.57 completely but 
again refer the reader to the above sources. 


Proposition A.2.54. Let $(X,\tau)$ be a topological space, and let $K$ be a compact
subset of $X$. Every closed subset of $K$ is compact.


Proof. Let $F \subset K$ be a closed set. Then $X - F$ is open. If $\mathcal{U}$ is an open cover of $F$,
then $\mathcal{U} \cup \{X - F\}$ is an open cover of $K$. Since $K$ is compact, $\mathcal{U} \cup \{X - F\}$ must
admit a finite subcover of $K$. This finite subcover of $K$ can be taken of the form $\mathcal{U}' \cup \{X - F\}$,
where $\mathcal{U}'$ is a finite subfamily of $\mathcal{U}$ that covers $F$. Thus, $F$ is compact.


Problem A.1.19 established the fact that two distinct points $p_1$ and $p_2$ in a metric
space possess, respectively, open neighborhoods $U_1$ and $U_2$ such that $U_1 \cap U_2 = \emptyset$.
This type of property is called a separation property of a topological space because
it gives some qualification for how much we can distinguish points in the topological
space. There exists a variety of separation axioms, but we only present the one that
is relevant for differential geometry.


Definition A.2.55. A topological space $(X,\tau)$ is called Hausdorff if given any two
distinct points $p_1$ and $p_2$ in $X$, there exist open neighborhoods $U_1$ of $p_1$ and $U_2$ of $p_2$ such
that $U_1 \cap U_2 = \emptyset$.


Proposition A.2.56. If $(X,\tau)$ is a Hausdorff topological space, then every compact
subset $K$ of $X$ is closed.


Proof. Let $K$ be compact. Since $X$ is Hausdorff, for every $x \in X - K$ and $y \in K$,
there exist open sets $U_{xy}$ and $V_{xy}$, with $x \in U_{xy}$ and $y \in V_{xy}$, such that $U_{xy} \cap V_{xy} = \emptyset$.
For each $x \in X - K$,

$$\{V_{xy} \mid y \in K\}$$

is an open cover of $K$, so it must possess a finite subcover that we index with a
finite number of points $y_1, y_2, \ldots, y_n$. But then for each $x$,

$$U_x = \bigcap_{i=1}^{n} U_{xy_i}$$

is open since it is a finite intersection of open sets. Since $K \subset \bigcup_{i=1}^{n} V_{xy_i}$, we conclude
that $K \cap U_x = \emptyset$ for all $x \in X - K$. Thus, $X - K$ is a neighborhood of all of its
points and hence it is open. Thus, $K$ is closed.


Theorem A.2.57. Let $A$ be any subset of $\mathbb{R}^n$ (equipped with the Euclidean topology).
The set $A$ is compact if and only if it is closed and bounded.

Proof. (We only prove $\Longrightarrow$.)

Suppose that $A$ is compact. Since by Problem A.1.19 any metric space is Hausdorff,
Proposition A.2.56 allows us to conclude that $A$ is closed. Since every open
set in a metric space is the union of open balls, any open cover of $A$ can be viewed as
a cover by open balls. Since $A$ is compact, it is contained in the union of only a finite number
of such open balls $\{B_{r_i}(p_i)\}_{i=1}^{m}$. There exists an open ball $B_r(p)$ that contains all
the $B_{r_i}(p_i)$; its radius $r$ can be chosen less than $(m-1)\max\{d(p_i,p_j)\} + 2\max\{r_i\}$. Then
$A \subset B_r(p)$, and hence, $A$ is bounded.

(The proof of the converse is more difficult and uses other techniques that we 
do not have the time to develop here.) 


Theorem A.2.57 establishes that we might view closed and bounded subsets of
$\mathbb{R}^n$ as the topological analog to $[a,b] \subset \mathbb{R}$ in Theorem A.2.49. We now complete
the generalization to topological spaces.


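Theorem A.2.58. Let $f : X \to Y$ be a continuous function between topological
spaces $X$ and $Y$. If $X$ is a compact space, then $f(X)$ is compact in $Y$.


Proof. Let $\mathcal{U}$ be an open cover of $f(X)$. Since $f$ is continuous, each $f^{-1}(U)$ is open
and the collection $\{f^{-1}(U) \mid U \in \mathcal{U}\}$ is an open cover of $X$. Since $X$ is compact,
there exists a finite set $\{U_1, U_2, \ldots, U_n\} \subset \mathcal{U}$ such that

$$X = \bigcup_{i=1}^{n} f^{-1}(U_i).$$

Since for any function $f(f^{-1}(A)) \subset A$ always and $f\big(\bigcup_\lambda A_\lambda\big) = \bigcup_\lambda f(A_\lambda)$, then

$$f(X) = f\Big(\bigcup_{i=1}^{n} f^{-1}(U_i)\Big) = \bigcup_{i=1}^{n} f\big(f^{-1}(U_i)\big) \subset \bigcup_{i=1}^{n} U_i.$$

Thus, $f(X)$ is compact.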


Corollary A.2.59. Let $X$ be a compact topological space, and let $f : X \to \mathbb{R}$ be a
continuous real-valued function from $X$. Then $f$ attains both a maximum and a minimum.


Proof. By Theorem A.2.58, the image $f(X)$ is compact. By Theorem A.2.57, $f(X)$
is a closed and bounded subset of $\mathbb{R}$. Hence, $\operatorname{lub}\{f(x) \mid x \in X\}$ and $\operatorname{glb}\{f(x) \mid x \in X\}$
are both elements of $f(X)$ and hence are the maximum and minimum of $f$ over
$X$.


Corollary A.2.60. Let $(X,\tau)$ and $(Y,\tau')$ be topological spaces, and let $f : X \to Y$
be a continuous function onto $Y$. If $X$ is compact, then so is $Y$.


Corollary A.2.61. Let $X$ be a compact topological space. If a space $Y$ is homeomorphic
to $X$, then $Y$ is compact.
A.2.5 Connectedness


We end this overview of point set topology with a discussion of connectedness.
The concept of connectedness is rather natural. We simply mean that we cannot
subdivide the topological space into two “parts,” i.e., nonempty disjoint open
subsets.


Definition A.2.62. A nonempty topological space $(X,\tau)$ is called connected if
whenever $X = U \cup V$, where $U$ and $V$ are open and disjoint, then either $U = \emptyset$ or
$V = \emptyset$.


As we will see, when proving results concerning connectedness, it is often useful 
to assume the set is not connected and prove a contradiction. Consequently, we 
present a definition and terminology for the negation of connectedness. 


Definition A.2.63. Let $(X,\tau)$ be a nonempty topological space. A separation of
$X$ is a pair $(U,V)$ of nonempty open subsets such that $U \cap V = \emptyset$ and $U \cup V = X$.
If $(X,\tau)$ has a separation then we say it is disconnected.
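
For a finite topological space, the definition of a separation can be checked by brute force. The following Python sketch is an illustration under stated assumptions: the function name `find_separation` is hypothetical, and the family `open_sets` is assumed to already be a topology on `points`. It looks for a nonempty open set whose complement is also nonempty and open.

```python
def find_separation(points, open_sets):
    """Return a separation (U, V) of a finite space, or None if none exists.

    `open_sets` is assumed to be the full list of open sets of a topology
    on `points`; this function does not verify the topology axioms.
    """
    space = frozenset(points)
    opens = {frozenset(s) for s in open_sets}
    for U in opens:
        V = space - U  # the only candidate partner: U and V must partition X
        if U and V and V in opens:
            return U, V
    return None


# Two-point space with open sets {}, {1}, {0, 1}: connected.
sierpinski = find_separation({0, 1}, [set(), {1}, {0, 1}])

# Two-point discrete space: disconnected.
discrete = find_separation({0, 1}, [set(), {0}, {1}, {0, 1}])
```

Since $U \cap V = \emptyset$ and $U \cup V = X$ force $V = X - U$, it suffices to test each open set against its complement.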


Proposition A.2.64. Let $(X,\tau)$ be a topological space. A subset $Y$ is connected if
and only if there does not exist a pair of open subsets $U, V \in \tau$, each meeting $Y$, such that $U \cap V \cap Y = \emptyset$
and $Y \subset U \cup V$.


Proof. The subspace $Y$ is disconnected if and only if there exists a pair $(U', V')$ of nonempty open
subsets in (the subspace topology of) $Y$ such that $U' \cup V' = Y$ and $U' \cap V' = \emptyset$. By
definition of the subspace topology, $U' = Y \cap U$ and $V' = Y \cap V$ for sets $U, V \in \tau$.
Since

$$Y = U' \cup V' = (Y \cap U) \cup (Y \cap V) = Y \cap (U \cup V),$$

then $Y \subset U \cup V$. Similarly, since $U' \cap V' = \emptyset$, then $U \cap V \cap Y = \emptyset$.


Proposition A.2.65. Let $X$ be a topological space and let $Y_1$ and $Y_2$ be two connected
subsets such that $Y_1 \cap Y_2 \neq \emptyset$. Then $Y_1 \cup Y_2$ is a connected subspace.


Proof. Assume the contrary, that $Y_1 \cup Y_2$ is disconnected. Let $(U,V)$ be a separation
of $Y_1 \cup Y_2$. Then either $Y_1 \subset U$ or $Y_1 \subset V$; otherwise $(U \cap Y_1, V \cap Y_1)$ forms a
separation of $Y_1$, which is a contradiction since $Y_1$ is connected. Similarly, $Y_2 \subset U$
or $Y_2 \subset V$. If $Y_1 \subset U$ and $Y_2 \subset V$ or if $Y_1 \subset V$ and $Y_2 \subset U$, then $Y_1 \cap Y_2 = \emptyset$, a
contradiction. If $Y_1 \subset U$ and $Y_2 \subset U$ or if $Y_1 \subset V$ and $Y_2 \subset V$, then either $V = \emptyset$
or $U = \emptyset$, both contradictions. Hence, $Y_1 \cup Y_2$ is not disconnected.


The following theorem gives one of the key examples of connected topological
spaces.


Theorem A.2.66. Any interval $I$ of $\mathbb{R}$ with the subspace topology is connected.



Proof. Assume that there exists a separation $(U,V)$ of $I$. Let $a \in U$ and $b \in V$,
and without loss of generality suppose that $a < b$. By definition of an interval, we
have $[a,b] \subset I$. Calling $A = U \cap [a,b]$ and $B = V \cap [a,b]$, we see that $[a,b] = A \cup B$
and that $A \cap B = \emptyset$.

Let $c = \sup A$. We show that $c$ is in neither $A$ nor $B$, which leads to a contradiction
since, by construction, $c \le b$ and $c \ge a$, and hence $c \in [a,b]$.

Assume that $c \in A$. Then $c \neq b$ so $c < b$. Since $A$ is open in $[a,b]$, there exists
an interval $[c, c+\varepsilon)$, with $c + \varepsilon < b$, contained in $A$. Then $c + \varepsilon/2 \in A$, contradicting
the hypothesis that $c = \sup A$.

Assume that $c \in B$. Then $c \neq a$, so either $c = b$ or $c \in (a,b)$, the open interval.
Since $B$ is open in $[a,b]$, there is some interval $(c-\varepsilon, c]$ in $B$. If $c = b$, then $c - \varepsilon/2$ is greater
than any element in $A$, contradicting $c = \sup A$. If $c < b$, then $(c,b] \subset B$ and thus
$[c,b] \subset B$. Since $B$ is open in $[a,b]$, then $(c-\varepsilon, b] \subset B$ so again $c - \varepsilon/2$ is greater than
any element in $A$, contradicting $c = \sup A$.

The theorem holds by contradiction.


Definition A.2.67. Let $(X,\tau)$ be a topological space. A connected open subset $U$
is called a connected component of $X$ if $U = X$ or if $(U, X - U)$ is a separation of
$X$.


Clearly, every topological space is a union of its connected components. 

There is another common way to formulate a precise definition for the intuitive
notion of being connected, namely, deciding whether it is possible to “get there from
here.”


Definition A.2.68. A topological space $(X,\tau)$ is called path-connected if for any
two points $p, q \in X$, there exists a continuous map $\gamma : [0,1] \to X$ such that $\gamma(0) = p$
and $\gamma(1) = q$.


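Proposition A.2.69. If a topological space is path-connected, then it is connected.

Proof. Let $(X,\tau)$ be path-connected and assume that $X$ has a separation $(U,V)$.
Let $p, q \in X$ with $p \in U$ and $q \in V$, and let $\gamma : [0,1] \to X$ be a continuous curve
connecting $p$ and $q$. Then

$$\gamma^{-1}(U) \cap \gamma^{-1}(V) = \gamma^{-1}(U \cap V) = \gamma^{-1}(\emptyset) = \emptyset.$$

On the other hand,

$$\gamma^{-1}(U) \cup \gamma^{-1}(V) = \gamma^{-1}(U \cup V) = \gamma^{-1}(X) = [0,1].$$

Both preimages are open because $\gamma$ is continuous, and they are nonempty because $0 \in \gamma^{-1}(U)$ and $1 \in \gamma^{-1}(V)$.
By Theorem A.2.66, $[0,1]$ is connected so this gives a contradiction because
$\gamma^{-1}(U)$ and $\gamma^{-1}(V)$ would form a separation of $[0,1]$. We deduce that $X$ is connected.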
Figure A.11: Connected but not path-connected.


This proposition allows us to quickly conclude that $\mathbb{R}^n$ and open balls $B_r(p)$ in
$\mathbb{R}^n$ are connected topological spaces.

Though path-connected implies connected, the reverse is not true. This means
that connected and path-connected are not equivalent concepts. The following
example describes a subset of $\mathbb{R}^2$ that is connected but not path-connected.


Example A.2.70. Let $X$ be the subset of $\mathbb{R}^2$ that is the union of the unit circle
and the image of the curve $\gamma : [1,\infty) \to \mathbb{R}^2$ with

$$\gamma(t) = \Big(\Big(1 - \frac{1}{t}\Big)\cos t,\ \Big(1 - \frac{1}{t}\Big)\sin t\Big).$$

See Figure A.11. The subspace $X$ is not path-connected since there is no path
connecting a point $p \in \gamma([1,\infty))$ to a point on the unit circle: any path from $p$
staying on $\gamma([1,\infty))$ moving toward the unit circle has infinite length, never reaching
the unit circle.

On the other hand, we claim that $X$ is connected. The set $\gamma([1,\infty))$ is clearly
path-connected and hence, by Proposition A.2.69, connected, as is the unit circle.
Assume there is a separation $(U,V)$ of $X$. Since $\gamma([1,\infty))$ is connected, it lies entirely in either $U$ or $V$.
Without loss of generality, suppose that $\gamma([1,\infty)) \subset U$. Then the unit circle is in
$V$. However, any open neighborhood of any point $p$ on the unit circle intersects
$\gamma([1,\infty))$. Hence, $U \cap V \neq \emptyset$. This contradicts the assumption that $X$ has a
separation and we conclude that $X$ is connected.
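
The two geometric facts used in this example can be explored numerically. The Python sketch below assumes the spiral is $\gamma(t) = ((1 - 1/t)\cos t, (1 - 1/t)\sin t)$; the helper names `gamma`, `gap_to_circle`, and `arc_length` are hypothetical. It shows that the gap between $\gamma(t)$ and the unit circle is $1/t$, which tends to 0, while the length of the curve over $[1, T]$ keeps growing with $T$.

```python
import math


def gamma(t):
    """Point of the assumed spiral at parameter t >= 1."""
    r = 1.0 - 1.0 / t
    return (r * math.cos(t), r * math.sin(t))


def gap_to_circle(t):
    """Distance from gamma(t) to the unit circle (equals 1/t for t >= 1)."""
    x, y = gamma(t)
    return 1.0 - math.hypot(x, y)


def arc_length(t_end, steps=20_000):
    """Polygonal approximation of the length of gamma over [1, t_end]."""
    total = 0.0
    prev = gamma(1.0)
    for k in range(1, steps + 1):
        t = 1.0 + (t_end - 1.0) * k / steps
        cur = gamma(t)
        total += math.dist(prev, cur)
        prev = cur
    return total
```

Comparing, say, `arc_length(100.0)` with `arc_length(200.0)` illustrates the unbounded growth in length, which is the informal reason no path inside the spiral reaches the circle.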


PROBLEMS 
A.2.1. Prove Proposition A.2.13. 
A.2.2. Find all the topologies on the set $\{a,b,c\}$. How many different topologies exist
on a set of four elements?

A.2.3. Consider the topology $\tau$ constructed in Example A.2.15. Prove that every open
set in $\tau$ is also open in the topology $\tau'$ induced from the Euclidean metric. Give
an example of an open set in $\tau'$ that is not $\tau$-open. (When these two facts hold,
one says that $\tau'$ is a strictly finer topology than $\tau$.)

A.2.4. Let $\tau$ be the set of all subsets of $\mathbb{R}$ that are unions of intervals of the form $[a,b)$.
Prove that $\tau$ is a topology on $\mathbb{R}$. Is $\tau$ the same topology as that induced by the
absolute value (Euclidean) metric?

A.2.5. Prove that in a topological space $(X,\tau)$, a set $A$ is open if and only if it is equal
to its interior.

A.2.6. Prove Proposition A.2.9.

A.2.7. Let $(X,\tau)$ be a topological space, and let $A$ and $B$ be any subsets of $X$. Prove
the following:

(a) $(A^\circ)^\circ = A^\circ$.
(b) $A^\circ \subset A$.
(c) $(A \cap B)^\circ = A^\circ \cap B^\circ$.
(d) A subset $U \subset X$ is open if and only if $U = U^\circ$.

A.2.8. Find an example that shows that $(A \cup B)^\circ$ is not necessarily equal to $A^\circ \cup B^\circ$.

A.2.9. Let $(X,\tau)$ be a topological space, and let $A$ and $B$ be any subsets of $X$. Prove
the following:

(a) $\operatorname{Cl}(\operatorname{Cl} A) = \operatorname{Cl} A$.
(b) $A \subset \operatorname{Cl} A$.
(c) $\operatorname{Cl}(A \cup B) = \operatorname{Cl} A \cup \operatorname{Cl} B$.
(d) A subset $F \subset X$ is closed if and only if $\operatorname{Cl} F = F$.

A.2.10. Find an example that shows that $\operatorname{Cl}(A \cap B)$ is not necessarily equal to $\operatorname{Cl} A \cap \operatorname{Cl} B$.

A.2.11. Let $(X,\tau)$ be a topological space, and let $A$ be any subset of $X$. Prove the
following:

(a) $\operatorname{Cl} A = A^\circ \cup \operatorname{Fr} A$.
(b) $\operatorname{Cl} A - \operatorname{Fr} A = A^\circ$.
(c) $\operatorname{Fr} A = \operatorname{Fr}(X - A)$.

A.2.12. Show that every open subset of $\mathbb{R}$ is the union of disjoint open intervals.

A.2.13. Let $(X,D)$ be any metric space, and let $A$ be any subset of $X$. Consider the
function $f : X \to \mathbb{R}^{\ge 0}$ defined by $f(x) = D(x, A)$, where $\mathbb{R}^{\ge 0}$ is equipped with the usual
topology. (Note that $[0,a)$ is open in this topology on $\mathbb{R}^{\ge 0}$.) Prove that $f$ is
continuous. (The set $f^{-1}([0,r))$ is sometimes called the open $r$-envelope of $A$.)

A.2.14. Let $(X,D)$ be any metric space, and let $A$ and $B$ be closed subsets of $X$. Use the
previous exercise to construct a continuous function $g : X \to \mathbb{R}$ such that $g(a) = 1$
for all $a \in A$ and $g(b) = -1$ for all $b \in B$.
A.2.15. Let $S^2$ be the two-dimensional unit sphere in $\mathbb{R}^3$. Give $S^2$ the topology of a metric
induced on $S^2$ as a subset of $\mathbb{R}^3$. Suppose we locate points on $S^2$ using $(\theta, \phi)$ in
spherical coordinates. Let $f_\alpha : S^2 \to S^2$ be the rotation function such that

$$f_\alpha(\theta, \phi) = (\theta + \alpha, \phi).$$

Prove that $f_\alpha$ is continuous.

A.2.16. Let $X$ be a topological space, and let $f : X \to \mathbb{R}$ be a continuous function. Prove
that the set of zeroes of $f$, namely $\{x \in X \mid f(x) = 0\}$, is closed.

A.2.17. Prove Proposition A.2.30.

A.2.18. Let $(X,\tau)$ and $(Y,\tau')$ be topological spaces, and let $f : X \to Y$ be a function.
Prove that $f$ is continuous if and only if for all closed sets $F \subset Y$, the set $f^{-1}(F)$
is closed in $X$.

A.2.19. Let $\mathbb{R}$ be the set of real numbers equipped with the absolute value topology. Prove
the following:

(a) Any open interval $(a,b)$ is homeomorphic to the open interval $(0,1)$.
(b) Any infinite open interval $(a,+\infty)$ is homeomorphic to $(1,\infty)$.
(c) Any infinite open interval $(a,+\infty)$ is homeomorphic to $(-\infty,a)$.
(d) The open interval $(0,1)$ is homeomorphic to the set of reals $\mathbb{R}$. [Hint: Use
$f(x) = \tan x$.]
(e) The interval $(1,\infty)$ is homeomorphic to $(0,1)$.

Conclude that all open intervals of $\mathbb{R}$ are homeomorphic.

A.2.20. Prove that a circle and a line segment are not homeomorphic.

A.2.21. Finish proving that the function $f$ in Example A.2.34 is a homeomorphism.

A.2.22. Consider $\mathbb{Z}$ and $\mathbb{Q}$ as subsets of $\mathbb{R}$ equipped with the absolute value metric. Decide
whether $\mathbb{Z}$ and $\mathbb{Q}$ are homeomorphic.

A.2.23. Let $(X,\tau)$ and $(Y,\tau')$ be topological spaces, and suppose that there exists a
continuous surjective function $f : X \to Y$. Define the equivalence relation on $X$
by

$$x \sim y \iff f(x) = f(y).$$

Prove that $X/\!\sim$ is homeomorphic to $(Y,\tau')$.

A.2.24. Find a quotient space of $\mathbb{R}^2$ homeomorphic to each of the following: (a) A straight
line; (b) A sphere; (c) A (filled) rectangle; (d) A torus.

A.2.25. Describe each of the following spaces:
(a) A finite cylinder with each of its boundary circles identified to a point.
(b) The sphere $S^2$ with an equator identified to a point.
(c) $\mathbb{R}^2$ with points identified according to $(x,y) \sim (-x,-y)$.

A.2.26. Find an open cover of the following sets that does not contain a finite subcover:
(a) R; (b) (0,1); (c) (0,1).
A.2.27. Let $K$ be a subset of a metric space $(X,D)$. Prove that $K$ is compact if and only
if every sequence in $K$ has an accumulation point in $K$.

A.2.28. Let $K$ be a compact subset of a metric space $(X,D)$. Show that the diameter
of $K$ is equal to $D(x,y)$ for some pair $x, y \in K$. Prove that given any $x \in X$,
$D(x,K)$ is equal to $D(x,y)$ for some $y \in K$.

A.2.29. Let $X$ be a compact topological space, and let $\{K_n\}_{n=1}^{\infty}$ be a sequence of nonempty
closed subsets of $X$, with $K_{n+1} \subset K_n$ for all $n$. Prove that $\bigcap_{n=1}^{\infty} K_n$ is nonempty.

A.2.30. Prove that the union of finitely many compact spaces is compact. Is the intersection
of two compact sets necessarily compact?

A.2.31. Prove that the set $\mathbb{R}$ equipped with the finite complement topology (see Example
A.2.14) is not Hausdorff.

A.2.32. Prove Corollary A.2.60.

A.2.33. Let $f : X \to Y$ be a continuous map between topological spaces. Show that if $X$
is connected, then $f(X)$ is connected in $Y$.

A.2.34. Decide with proof if $A = \{(x,y) \in \mathbb{R}^2 \mid x > 0 \text{ and } (y = 0 \text{ or } y = 1/x)\}$, with the
subset topology from $\mathbb{R}^2$, is connected or disconnected.

A.2.35. Show that the Cartesian product of two connected topological spaces is again
connected.

A.2.36. Show that as subsets of $\mathbb{R}^2$, the union of two open balls $X = B_1(-1,0) \cup B_1(1,0)$
is disconnected but that $Y = X \cup \{(0,0)\}$ is connected.


APPENDIX B 


Calculus of Variations 


B.1. Formulation of Several Problems 


One of the greatest uses of calculus is the principle that extrema of a continuous
function occur at critical points, i.e., at points of the domain where the first
derivative (partial derivatives when dealing with a multivariable function) is (are
all) 0 or not defined. In practical applications, when we wish to optimize a certain
quantity, we write down a function describing said quantity in terms of relevant
independent variables, calculate the first partials, and find the points where these
derivatives are all equal to 0 or are undefined.

Many other problems in math and physics, however, involve quantities that do 
not just depend on independent variables but on an independent function. Some 
classic examples are problems that ask us to find the shortest distance between 
two points, the shape with fixed perimeter enclosing the most area, and the curve 
of quickest descent between two points. Calculus of variations refers to a general 
method to deal with such problems. 

Let $[x_1,x_2]$ be a fixed interval of real numbers. For any differentiable function
$y : [x_1,x_2] \to \mathbb{R}$, the definite integral

$$I(y) = \int_{x_1}^{x_2} f(x, y, y')\,dx \tag{B.1}$$

is a well defined quantity that depends only on $y(x)$ when the integrand $f$ is a
function of the arguments $x$, $y$, and $y'$. We can view the above integral $I$ as a
function from $C^1([x_1,x_2],\mathbb{R})$, the set of all continuously differentiable functions
on $[x_1,x_2]$, to $\mathbb{R}$. The problem is to find all functions $y(x)$ for which $I(y)$ attains
a minimum or maximum value among all $y \in C^1([x_1,x_2],\mathbb{R})$. Unlike optimization
problems in usual multivariable calculus that involve solving algebraic equations,
this initial problem in the calculus of variations involves a second-order differential
equation for which the constants of integration are fixed once we set $y(x_1) = y_1$ and
$y(x_2) = y_2$.
Many generalizations to this first problem exist. For example, similar to optimization
problems in multiple variables, we may impose certain conditions so that
we consider only a subset of functions in $C^1([x_1,x_2],\mathbb{R})$ among those to optimize
$I(y)$. In another direction, we may seek to optimize the double integral

$$I(w) = \iint_D f\Big(x, y, w, \frac{\partial w}{\partial x}, \frac{\partial w}{\partial y}\Big)\,dA,$$

where $D$ is a region of $\mathbb{R}^2$ and $w$ is a two-variable function. The solution would be
a function $w \in C^1(D,\mathbb{R})$ that produces the maximum or minimum value for the
integral. Of course, we can consider situations where the unknown function $w$ is a
function of any number of variables. As a third type of generalization, we consider
the integral

$$I(x,y) = \int_{t_1}^{t_2} f\Big(t, x, y, \frac{dx}{dt}, \frac{dy}{dt}\Big)\,dt,$$

where $I(x,y)$ involves two unknown functions of one independent variable $t$.

Finally, we may then consider any number of combinations of the above generalizations.
For example, the isoperimetric problem (the problem of finding the
shape with a fixed perimeter and maximum area) involves finding parametric
equations $x(t)$ and $y(t)$ that produce a simple closed curve that maximizes area
(a one-variable integration by Green’s Theorem), subject to the condition that the
perimeter is some fixed constant.

The following sections follow the excellent presentation given in [58]. 


B.2 Euler-Lagrange Equation 
B.2.1 Main Theorem 


Many problems in calculus of variations amount to solving a particular differential 
equation called the Euler-Lagrange equation and variants thereof. However, all the 
theorems that justify the use of the Euler-Lagrange equation hinge on one lemma 
and its subsequent generalizations. 


Lemma B.2.1. Let $G$ be a continuous real-valued function on an interval $[x_1,x_2]$.
If

$$\int_{x_1}^{x_2} \eta(x)G(x)\,dx = 0 \tag{B.2}$$

for all continuously differentiable functions $\eta(x)$ that satisfy $\eta(x_1) = \eta(x_2) = 0$,
then $G(x) = 0$ for all $x \in [x_1,x_2]$.


Proof. We prove the contrapositive, namely, if $G$ is not identically 0 then there
exists some function $\eta(x)$ on $[x_1,x_2]$ that does not satisfy Equation (B.2). If we
assume that $G$ is not identically 0, then there exists $c \in [x_1,x_2]$ such that $G(c) \neq 0$.
By continuity, there exist $a, b$ such that $x_1 \le a < c < b \le x_2$ and $G(x) \neq 0$ for all
$x \in [a,b]$. Now consider the function

$$\eta(x) = \begin{cases} 0, & \text{for } x_1 \le x \le a, \\ G(c)(x-a)^2(x-b)^2, & \text{for } a < x < b, \\ 0, & \text{for } b \le x \le x_2. \end{cases}$$

The function $\eta(x)$ is continuously differentiable, and we have

$$\int_{x_1}^{x_2} \eta(x)G(x)\,dx = \int_a^b G(c)G(x)(x-a)^2(x-b)^2\,dx.$$

The integrand on the right is nonnegative since $G(x)$ has the same sign as $G(c)$ and,
by construction, equal to 0 only at $x = a$ and $x = b$. Consequently, the integral on
the right is positive. This proves the lemma.
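
The construction in this proof can be checked numerically for a specific $G$. In the Python sketch below, the choices $G(x) = x - 0.5$ on $[0,1]$, $a = 0.6$, $b = 0.9$, $c = 0.8$, and the helper names are illustrative assumptions, not taken from the text. The test function $\eta$ vanishes at both endpoints, yet the integral of $\eta G$ is strictly positive.

```python
def G(x):
    """A continuous function on [0, 1] that is not identically zero."""
    return x - 0.5


A, B, C = 0.6, 0.9, 0.8  # G has constant sign on [A, B], and A < C < B


def eta(x):
    """The test function from the proof: a C^1 bump supported on [A, B]."""
    if A < x < B:
        return G(C) * (x - A) ** 2 * (x - B) ** 2
    return 0.0


def integral(func, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


value = integral(lambda x: eta(x) * G(x), 0.0, 1.0)
```

For these choices the exact value is $0.3 \cdot 0.25 \cdot (0.3)^5/30 = 6.075 \times 10^{-6}$, so Equation (B.2) fails for this $\eta$, as the lemma requires.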


Let us consider the first problem in the calculus of variations, in which we wish
to optimize the integral in Equation (B.1), with the only condition that $y(x_1) = y_1$
and $y(x_2) = y_2$. The general tactic proceeds as follows. Assume $y(x)$ is a function
that optimizes $I(y)$. Let $\eta(x)$ be an arbitrary continuously differentiable function
on $[x_1,x_2]$, with $\eta(x_1) = \eta(x_2) = 0$. Define the one-parameter family of functions
$Y_\varepsilon$ by

$$Y_\varepsilon(x) = y(x) + \varepsilon\eta(x).$$

Obviously, for all $\varepsilon$, we have $Y_\varepsilon(x_1) = y(x_1) = y_1$ and $Y_\varepsilon(x_2) = y(x_2) = y_2$. For
shorthand, we define

$$I(\varepsilon) = \int_{x_1}^{x_2} f(x, Y_\varepsilon, Y_\varepsilon')\,dx.$$

With this notation, we see that $I(0) = I(y)$, and since $y(x)$ is an optimizing function,
then

$$I'(0) = 0 \tag{B.3}$$

no matter the choice of arbitrary function $\eta(x)$.
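To calculate the derivative in Equation (B.3), we obtain

$$I'(\varepsilon) = \int_{x_1}^{x_2} \Big(\frac{\partial f}{\partial Y_\varepsilon}\frac{\partial Y_\varepsilon}{\partial \varepsilon} + \frac{\partial f}{\partial Y_\varepsilon'}\frac{\partial Y_\varepsilon'}{\partial \varepsilon}\Big)\,dx = \int_{x_1}^{x_2} \Big(\frac{\partial f}{\partial Y_\varepsilon}\eta + \frac{\partial f}{\partial Y_\varepsilon'}\eta'\Big)\,dx,$$

where $\frac{\partial f}{\partial Y_\varepsilon}$ means explicitly $\frac{\partial f}{\partial y}(x, Y_\varepsilon(x), Y_\varepsilon'(x))$ and similarly for $\frac{\partial f}{\partial Y_\varepsilon'}$. Setting $\varepsilon = 0$,
Equation (B.3) then becomes

$$I'(0) = \int_{x_1}^{x_2} \Big(\frac{\partial f}{\partial y}\eta + \frac{\partial f}{\partial y'}\eta'\Big)\,dx = 0.$$

Integrating the second term in this integral by parts, we obtain

$$I'(0) = \Big[\frac{\partial f}{\partial y'}\eta\Big]_{x_1}^{x_2} + \int_{x_1}^{x_2} \Big(\frac{\partial f}{\partial y}\eta - \frac{d}{dx}\Big(\frac{\partial f}{\partial y'}\Big)\eta\Big)\,dx = \int_{x_1}^{x_2} \Big(\frac{\partial f}{\partial y} - \frac{d}{dx}\Big(\frac{\partial f}{\partial y'}\Big)\Big)\eta\,dx = 0,$$

where the boundary term vanishes because $\eta(x_1) = \eta(x_2) = 0$.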


Applying Lemma B.2.1 to the above equation proves the following theorem.


Theorem B.2.2. Let $y : [x_1,x_2] \to \mathbb{R}$ be a function that optimizes

$$I(y) = \int_{x_1}^{x_2} f(x, y, y')\,dx.$$

Then $y$ satisfies the differential equation

$$\frac{\partial f}{\partial y} - \frac{d}{dx}\Big(\frac{\partial f}{\partial y'}\Big) = 0, \tag{B.4}$$

which is called the Euler-Lagrange equation.


Just as a solution $x_0$ to $f'(x) = 0$ is not necessarily a maximum or minimum,
a function that satisfies this equation is not necessarily an optimizing function.
Consequently, we call a solution to Equation (B.4) an extremizing function.
Understanding that $\frac{\partial f}{\partial y'}$ means $f_{y'}(x, y(x), y'(x))$, we notice that the Euler-Lagrange
equation is a second-order differential equation of $y$ in terms of $x$.
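
As a small numerical illustration of the condition $I'(0) = 0$, consider the arclength integrand $f = \sqrt{1 + (y')^2}$ on $[0,1]$ with $y(0) = 0$ and $y(1) = 1$. The Euler-Lagrange equation reduces to $y'' = 0$, so the extremizing function is $y(x) = x$. The Python sketch below uses the hypothetical perturbation $\eta(x) = \sin(\pi x)$ and hypothetical helper names, and compares $I(\varepsilon)$ for several values of $\varepsilon$.

```python
import math


def arclength(y, n=2000):
    """Approximate the integral of sqrt(1 + y'^2) over [0, 1] by a polygon."""
    total = 0.0
    for k in range(n):
        x0, x1 = k / n, (k + 1) / n
        total += math.hypot(x1 - x0, y(x1) - y(x0))
    return total


def perturbed(eps):
    """Y_eps(x) = y(x) + eps * eta(x) with y(x) = x and eta(x) = sin(pi x)."""
    return lambda x: x + eps * math.sin(math.pi * x)


def I(eps):
    """The functional evaluated on the perturbed curve Y_eps."""
    return arclength(perturbed(eps))


# Central-difference estimate of I'(0); it should be close to 0.
derivative_at_zero = (I(1e-4) - I(-1e-4)) / 2e-4
```

Here $I(0)$ is the length $\sqrt{2}$ of the straight segment, and every nonzero $\varepsilon$ tried gives a larger value, consistent with the straight line being a minimizer in this case.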

Since Equation (B.4), and in particular the left-hand side of this equation, occurs
frequently, we define it as the Lagrangian operator $\mathcal{L}$ on a function $f(x, y, y')$, where
$y$ is a function of $x$, by

$$\mathcal{L}(f) = \frac{\partial f}{\partial y} - \frac{d}{dx}\Big(\frac{\partial f}{\partial y'}\Big).$$

This $\mathcal{L}(f)$ is a differential operator on functions $y$ in $C^2([x_1,x_2],\mathbb{R})$ because for any
given function $y(x)$, $\mathcal{L}(f)(y)$ is a continuous function over $[x_1,x_2]$. Note that $\mathcal{L}$ is
a linear transformation in $f$. On the other hand, whether the differential equation
$\mathcal{L}(f) = 0$ is linear in $y(x)$ depends on $f$.


B.2.2. Brachistochrone Problem 


At the turn of the 18th century, Johann Bernoulli posed the problem of finding the 
path in space that a particle will take when travelling under the action of gravity 
between two fixed points but taking the shortest amount of time. To be precise, the 
problem assumes no friction, a simple constant force of gravity mg (where m is the 
mass of the particle and g the gravity constant), and an initial velocity v; that is 
not necessarily 0. This problem became known as the “brachistochrone” problem, 
the roots of which come from the Greek words brachistos (shortest) and chronos 
(time). 


We suppose the two fixed points $A$ and $B$ lie in a vertical plane that we can
label as the $xy$-plane, with the $y$-axis directed vertically upward and the $x$-axis
oriented so that passing from $A$ to $B$ means an increase in $x$. Let $A = (x_1, y_1)$
and $B = (x_2, y_2)$ so that any curve $y(x)$ connecting $A$ and $B$ satisfies $y(x_1) = y_1$
and $y(x_2) = y_2$. Note that though the shape of a curve from $A$ to $B$ is a function
$y(x)$, a particle moving along this curve under the action of gravity travels with
nonconstant speed.

The speed along the curve is given by $v = \frac{ds}{dt}$, where the arclength function $s(x)$
satisfies

$$\frac{ds}{dx} = \sqrt{1 + (y'(x))^2}.$$

The total time $T$ of descent along the path $y(x)$ is given by the integral

$$T = \int_{x=x_1}^{x=x_2} 1\,dt = \int_{x_1}^{x_2} \frac{ds}{v} = \int_{x_1}^{x_2} \frac{\sqrt{1 + (y')^2}}{v}\,dx.$$

Since there is no friction and since gravity is a conservative force, the sum of the
kinetic energy and potential energy remains constant, namely,

$$\frac{1}{2}mv^2 + mgy = \frac{1}{2}mv_1^2 + mgy_1.$$

Solving for $v$ we obtain

$$v = \sqrt{v_1^2 + 2gy_1 - 2gy} = \sqrt{2g}\sqrt{y_0 - y},$$

where $y_0 = y_1 + (v_1^2/2g)$ is the height from which the particle descended from rest
to reach $v_1$ at height $y_1$. The time of travel is

$$T = \frac{1}{\sqrt{2g}} \int_{x_1}^{x_2} \frac{\sqrt{1 + (y')^2}}{\sqrt{y_0 - y}}\,dx, \tag{B.5}$$


and finding the path with the shortest time of travel amounts to finding a function 
y(x) that minimizes this integral. 
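Applying the Euler-Lagrange equation, we label the integrand in Equation (B.5)
as

$$f(x, y, y') = \frac{\sqrt{1 + (y')^2}}{\sqrt{y_0 - y}}. \tag{B.6}$$

Notice that this problem has one simplification from the general Euler-Lagrange
equation: $f$ does not depend explicitly on $x$. This fact allows us to make a useful
simplification. The chain rule gives

$$\frac{df}{dx} = \frac{\partial f}{\partial x} + y'\frac{\partial f}{\partial y} + y''\frac{\partial f}{\partial y'} = y'\frac{\partial f}{\partial y} + y''\frac{\partial f}{\partial y'}$$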
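since $f$ does not depend directly on $x$. However,

$$\frac{d}{dx}\Big(y'\frac{\partial f}{\partial y'}\Big) = y''\frac{\partial f}{\partial y'} + y'\frac{d}{dx}\Big(\frac{\partial f}{\partial y'}\Big),$$

so

$$\frac{df}{dx} = \frac{d}{dx}\Big(y'\frac{\partial f}{\partial y'}\Big) + y'\Big(\frac{\partial f}{\partial y} - \frac{d}{dx}\Big(\frac{\partial f}{\partial y'}\Big)\Big) = \frac{d}{dx}\Big(y'\frac{\partial f}{\partial y'}\Big),$$

where the second term in the middle expression is identically 0 due to the Euler-Lagrange
equation. Integrating both sides with respect to $x$ we obtain

$$y'\frac{\partial f}{\partial y'} - f = C$$

for some constant $C$. Using the specific function in Equation (B.6), we obtain

$$\frac{(y')^2}{\sqrt{(y_0 - y)(1 + (y')^2)}} - \frac{\sqrt{1 + (y')^2}}{\sqrt{y_0 - y}} = C.$$

Solving for $y' = \frac{dy}{dx}$ we obtain

$$\frac{dy}{dx} = \frac{\sqrt{C^{-2} - (y_0 - y)}}{\sqrt{y_0 - y}},$$

which, upon taking the inverse and integrating with respect to $y$, becomes

$$x = \int \frac{\sqrt{y_0 - y}}{\sqrt{C^{-2} - (y_0 - y)}}\,dy. \tag{B.7}$$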

Using the substitution

$$y_0 - y = \frac{1}{C^2}\sin^2\frac{\theta}{2}, \tag{B.8}$$

the integral in Equation (B.7) becomes

$$x = -\frac{1}{C^2}\int \sin^2\frac{\theta}{2}\,d\theta = -\frac{1}{2C^2}\int (1 - \cos\theta)\,d\theta = \frac{1}{2C^2}(\sin\theta - \theta) + x_0, \tag{B.9}$$

where $x_0$ is some constant of integration. Rewriting Equation (B.8), setting $a =
1/(2C^2)$, and substituting $t = -\theta$, we obtain the equations

$$x = x_0 + a(t - \sin t),$$
$$y = y_0 - a(1 - \cos t).$$

Obviously, these equations do not give $y$ as an explicit function of $x$ but do show
that the path with most rapid descent is in the shape of an upside-down cycloid.
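
To get a feeling for the result, one can compare descent times numerically. The Python sketch below makes several simplifying assumptions that are not in the text: the particle starts from rest at the origin (so $x_0 = y_0 = 0$ and $v_1 = 0$), the value $g = 9.81$ and the function names are illustrative, and the comparison curve is the straight chord between the same endpoints. The time along the cycloid is computed as the integral of $ds/v$ in the parameter $t$.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (illustrative value)


def cycloid_time(a, t_end, steps=10_000):
    """Descent time from rest along x = a(t - sin t), y = -a(1 - cos t)."""
    total = 0.0
    dt = t_end / steps
    for k in range(steps):
        t = (k + 0.5) * dt  # midpoint rule avoids the singular start t = 0
        ds = a * math.sqrt((1 - math.cos(t)) ** 2 + math.sin(t) ** 2) * dt
        v = math.sqrt(2 * G * a * (1 - math.cos(t)))  # energy conservation
        total += ds / v
    return total


def line_time(x2, drop):
    """Descent time from rest along the straight chord to (x2, -drop)."""
    length = math.hypot(x2, drop)
    # Constant acceleration G * drop / length along the incline.
    return length * math.sqrt(2.0 / (G * drop))


# Endpoint reached by the cycloid with a = 1 at t = pi is (pi, -2).
t_cycloid = cycloid_time(1.0, math.pi)
t_line = line_time(math.pi, 2.0)
```

Along the cycloid the ratio $ds/v$ simplifies to $\sqrt{a/g}\,dt$, so the descent time up to parameter $t$ is $t\sqrt{a/g}$; for these endpoints it is shorter than the time along the chord.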


B.3 Several Dependent Variables
B.3.1 Main Theorem 


A first generalization to the basic problem in the calculus of variations is to find $n$
twice-differentiable functions $x_1(t), \ldots, x_n(t)$ defined over the interval $[t_1,t_2]$ that
optimize the integral

$$I = \int_{t_1}^{t_2} f(x_1, \ldots, x_n, x_1', \ldots, x_n', t)\,dt. \tag{B.10}$$

We follow the same technique as in Section B.2. Label $x_1(t), \ldots, x_n(t)$ as the actual
optimizing functions and define corresponding one-parameter families of functions
by

$$X_i(t) = x_i(t) + \varepsilon\xi_i(t),$$

where $\xi_i(t)$ are any differentiable functions with

$$\xi_i(t_1) = \xi_i(t_2) = 0 \quad \text{for } 1 \le i \le n.$$

With the one-parameter families $X_i$, we form the integral

$$I(\varepsilon) = \int_{t_1}^{t_2} f(X_1, \ldots, X_n, X_1', \ldots, X_n', t)\,dt.$$

Then $I(0) = I$, and since by assumption the functions $x_1, \ldots, x_n$ are the optimizing
functions, we must also have $I'(0) = 0$.
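Taking the derivative of $I(\varepsilon)$ and using the chain rule, we have

$$I'(\varepsilon) = \int_{t_1}^{t_2} \Big(\frac{\partial f}{\partial X_1}\xi_1 + \cdots + \frac{\partial f}{\partial X_n}\xi_n + \frac{\partial f}{\partial X_1'}\xi_1' + \cdots + \frac{\partial f}{\partial X_n'}\xi_n'\Big)\,dt,$$

where by $\partial f/\partial X_i$ we mean the partial derivative of $f$ with respect to the argument
into which we substitute the one-parameter family of functions $X_i$. Regardless of the arbitrary
functions $\xi_i$, setting $\varepsilon = 0$ replaces the family of functions $X_i$ with the function $x_i$.
Using the same abuse of notation for $\partial f/\partial x_i$, we have

$$\int_{t_1}^{t_2} \sum_{i=1}^{n} \Big(\frac{\partial f}{\partial x_i}\xi_i + \frac{\partial f}{\partial x_i'}\xi_i'\Big)\,dt = 0.$$

Since this equation must hold for all choices of the functions $\xi_i$, we can in particular
set $\xi_j = 0$ for all indices $j \neq i$. Then we deduce that

$$\int_{t_1}^{t_2} \Big(\frac{\partial f}{\partial x_i}\xi_i + \frac{\partial f}{\partial x_i'}\xi_i'\Big)\,dt = 0 \quad \text{for all } i.$$

Integrating the second term in the above integral by parts and using the fact that
$\xi_i(t_1) = \xi_i(t_2) = 0$, we obtain

$$\int_{t_1}^{t_2} \Big(\frac{\partial f}{\partial x_i} - \frac{d}{dt}\Big(\frac{\partial f}{\partial x_i'}\Big)\Big)\xi_i\,dt = 0.$$

Then using Lemma B.2.1, we deduce the following theorem.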


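Theorem B.3.1. Consider the integral

$$I = \int_{t_1}^{t_2} f(x_1, \ldots, x_n, x_1', \ldots, x_n', t)\,dt,$$

where $f$ is a continuous function and each $x_i(t)$ is a twice-differentiable function
defined over $[t_1,t_2]$. If the functions $x_1, x_2, \ldots, x_n$ optimize the integral $I$, then

$$\frac{\partial f}{\partial x_i} - \frac{d}{dt}\Big(\frac{\partial f}{\partial x_i'}\Big) = 0 \quad \text{for all } 1 \le i \le n.$$


Here again, if $f$ is a function as defined in the above theorem, we define the
Lagrangian operator $\mathcal{L}_i$ or $\mathcal{L}_{x_i}$ as

$$\mathcal{L}_i(f) = \frac{\partial f}{\partial x_i} - \frac{d}{dt}\Big(\frac{\partial f}{\partial x_i'}\Big).$$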


B.4 Isoperimetric Problems and Lagrange Multipliers 
B.4.1 Main Theorem 


In this section, we approach a new class of problems in which we desire not only to 
optimize a certain integral but to do so considering only functions that satisfy an 
additional criterion besides the usual restriction of continuity. In all the problems 
we consider, the criteria consist of imposing a prescribed value on a certain integral 
related to our variable function. More precisely, we will wish to construct a function
$x(t)$ defined over an interval $[t_1,t_2]$ that optimizes the integral

$$I = \int_{t_1}^{t_2} f(x, x', t)\,dt, \tag{B.11}$$

subject to the condition that

$$\int_{t_1}^{t_2} g(x, x', t)\,dt = J \tag{B.12}$$

for some fixed value of $J$. It is assumed that $f$ and $g$ are twice-differentiable functions
in their variables. Such a problem is called an isoperimetric problem.



Following the same approach as in Section B.2, we label $x(t)$ as the actual optimizing
function to the integral in Equation (B.11), which we assume also satisfies
Equation (B.12), and we introduce a two-parameter family of functions

$$X(t) = x(t) + \varepsilon_1\xi_1(t) + \varepsilon_2\xi_2(t),$$

where $\xi_1(t)$ and $\xi_2(t)$ are any differentiable functions that satisfy

$$\xi_1(t_1) = \xi_2(t_1) = \xi_1(t_2) = \xi_2(t_2) = 0. \tag{B.13}$$

The condition in Equation (B.13) guarantees that $X(t_1) = x(t_1) = x_1$ and $X(t_2) =
x(t_2) = x_2$ for all choices of the parameters $\varepsilon_1$ and $\varepsilon_2$. We use the family of functions
$X(t)$ as a comparison to the optimizing function $x(t)$, but in contrast to Section
B.2, we need a two-parameter family, as we shall see shortly.

We replace the function $x(t)$ with the family $X(t)$ in Equations (B.11) and (B.12)
to obtain

$$I(\varepsilon_1,\varepsilon_2) = \int_{t_1}^{t_2} f(X, X', t)\,dt$$

and

$$J(\varepsilon_1,\varepsilon_2) = \int_{t_1}^{t_2} g(X, X', t)\,dt.$$

The parameters $\varepsilon_1$ and $\varepsilon_2$ cannot be independent if the family $X(t)$ is to always
satisfy Equation (B.12). Indeed, since $J$ is constant, $\varepsilon_1$ and $\varepsilon_2$ satisfy the equation

$$J(\varepsilon_1,\varepsilon_2) = J \quad \text{(a constant)}. \tag{B.14}$$

Since $x(t)$ is assumed to be the optimizing function, then $I(\varepsilon_1,\varepsilon_2)$ is optimized with
respect to $\varepsilon_1$ and $\varepsilon_2$, subject to Equation (B.14), when $\varepsilon_1 = \varepsilon_2 = 0$, no matter the
particular choice of $\xi_1(t)$ and $\xi_2(t)$.

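Consequently, we can apply the method of Lagrange multipliers, usually presented
in a multivariable calculus course. Following that method, $I(\varepsilon_1, \varepsilon_2)$ is
optimized, subject to Equation (B.14), when


\[
\begin{cases}
\dfrac{\partial I}{\partial \varepsilon_i} = \lambda \dfrac{\partial J}{\partial \varepsilon_i} & \text{for } i = 1, 2, \\[1ex]
J(\varepsilon_1, \varepsilon_2) = J,
\end{cases}
\tag{B.15}
\]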


where $\lambda$ is a free parameter called the Lagrange multiplier. In order to apply this
to Euler-Lagrange methods of optimizing integrals, define the function


\[ f^*(x, x', t) = f(x, x', t) - \lambda g(x, x', t). \]
Then the first two equations in Equation (B.15) are tantamount to solving
\[ \frac{\partial}{\partial \varepsilon_i} \int_{t_1}^{t_2} f^*(X, X', t)\,dt = 0 \quad \text{for } i = 1, 2. \]


Following a nearly identical approach as in Section B.2, the details of which we 
leave to the interested reader, we can prove the following theorem. 
Theorem B.4.1. Assume that $f$ and $g$ are twice-differentiable functions $\mathbb{R}^3 \to \mathbb{R}$.
Let $x : [t_1, t_2] \to \mathbb{R}$ be a function that optimizes


\[ I = \int_{t_1}^{t_2} f(x, x', t)\,dt, \]


subject to the condition that
\[ J = \int_{t_1}^{t_2} g(x, x', t)\,dt \]


remains constant. Then $x$ satisfies the differential equation


\[ \frac{\partial f^*}{\partial x} - \frac{d}{dt}\left( \frac{\partial f^*}{\partial x'} \right) = 0, \tag{B.16} \]


where $f^* = f - \lambda g$. Furthermore, the solution to Equation (B.16) produces an
expression for $x(t)$ that depends on two constants of integration and the parameter
$\lambda$ and, if a solution to this isoperimetric problem exists, then these quantities are
fixed by requiring that $x(t_1) = x_1$, $x(t_2) = x_2$, and $J$ be a constant.


Many generalizations extend this theorem, but rather than presenting in great 
detail the variants thereof, we present an example that shows why we refer to the 
class of problems presented in this section as isoperimetric problems. 


B.4.2. Problem of Maximum Enclosed Area 


The classic question, “What closed simple curve of fixed length encloses the most
area?”, is simple to phrase and yet surprisingly difficult to answer. Even Greek
geometers “knew” that if we fix the length of a closed curve, the circle has the
largest area, but no rigorous proof is possible without the techniques of calculus of
variations.

To solve this problem, consider parametric curves $\vec{x}(t) = (x(t), y(t))$ with $t \in
[t_1, t_2]$. We assume the curve is closed so that $\vec{x}(t_1) = \vec{x}(t_2)$ and similarly for all
derivatives of $\vec{x}$. The arclength formula for this curve is


\[ S = \int_{t_1}^{t_2} \sqrt{(x')^2 + (y')^2}\,dt, \]
and by a corollary to Green’s Theorem, the area of the enclosed region is


\[ A = \int_{t_1}^{t_2} x y'\,dt. \]


Therefore, we wish to optimize the integral A, subject to the constraint that the 
integral S is fixed, say S = p. 


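Following Theorem B.4.1 but adapting it to the situation of more than one
dependent variable, we define the function


\[ f^*(x, x', y, y', t) = x y' - \lambda \sqrt{(x')^2 + (y')^2}, \]


and conclude that the curve with the greatest area satisfies


\[
\begin{cases}
\dfrac{\partial f^*}{\partial x} - \dfrac{d}{dt}\left( \dfrac{\partial f^*}{\partial x'} \right) = 0, \\[1ex]
\dfrac{\partial f^*}{\partial y} - \dfrac{d}{dt}\left( \dfrac{\partial f^*}{\partial y'} \right) = 0.
\end{cases}
\tag{B.17}
\]


Taking appropriate derivatives and integrating once with respect to $t$, Equation (B.17) becomes


\[
\begin{cases}
y + \lambda \dfrac{x'}{\sqrt{(x')^2 + (y')^2}} = C_1, \\[1ex]
x - \lambda \dfrac{y'}{\sqrt{(x')^2 + (y')^2}} = C_2.
\end{cases}
\]
From this, we deduce the relation
\[ (x - C_2)^2 + (y - C_1)^2 = \lambda^2, \tag{B.18} \]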


which means that the curve with a given perimeter and with maximum area lies
on a circle. Since the curve is closed and simple, the parametric curve $\vec{x}(t)$ is an
injective function (except for $\vec{x}(t_1) = \vec{x}(t_2)$), and the image is in fact a circle, though
there is no assumption that $\vec{x}$ travels around the circle at a uniform rate. That the
Lagrange multiplier appears in Equation (B.18) is not an issue because, since we
know that the perimeter is fixed at $p$, we know that $\lambda = p/(2\pi)$.
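
As a numerical sanity check on this conclusion, the following minimal Python sketch (not part of the original text) evaluates the left-hand sides of the two integrated equations on a counterclockwise circle of perimeter $p$ with $\lambda = p/(2\pi)$ and confirms that they stay constant along the curve. The perimeter value, the center of the circle, and all function names are illustrative assumptions.

```python
import math

P = 10.0                     # prescribed perimeter (illustrative value)
LAM = P / (2 * math.pi)      # Lagrange multiplier; equals the radius of the circle
C1, C2 = -1.0, 2.0           # constants of integration; the center is (C2, C1)

def first_integrals(t):
    """Left-hand sides of the two integrated Euler-Lagrange equations at parameter t."""
    x = C2 + LAM * math.cos(t)
    y = C1 + LAM * math.sin(t)
    dx = -LAM * math.sin(t)          # x'(t)
    dy = LAM * math.cos(t)           # y'(t)
    speed = math.hypot(dx, dy)       # sqrt(x'^2 + y'^2)
    return y + LAM * dx / speed, x - LAM * dy / speed

def circle_area(p):
    """Area enclosed by a circle of perimeter p."""
    return p * p / (4 * math.pi)

def square_area(p):
    """Area enclosed by a square of perimeter p, for comparison."""
    return (p / 4) ** 2
```

Calling `first_integrals(t)` for several values of `t` should return the pair `(C1, C2)` up to rounding, and `circle_area(P)` exceeds `square_area(P)`, as the isoperimetric property predicts for this one competitor.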


APPENDIX C


Further Topics in Multilinear Algebra 


Chapter 4 offers only a brief introduction to multilinear algebra. This appendix
supplies a few more short topics with close connections to $k$-volume formulas in $\mathbb{R}^n$.


C.1 Binet-Cauchy and k-Volume of Parallelepipeds


The article [30] develops the connection between the wedge product of vectors in
$\mathbb{R}^n$ and analytic geometry. Most important for applications to differential geometry
is a formula for the volume of a $k$-dimensional parallelepiped in $\mathbb{R}^n$. The authors
of [30] give the following definition.


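Definition C.1.1. The dot product of two pure antisymmetric tensors in $\bigwedge^k \mathbb{R}^n$ is


\[
(\vec{a}_1 \wedge \vec{a}_2 \wedge \cdots \wedge \vec{a}_k) \cdot (\vec{b}_1 \wedge \vec{b}_2 \wedge \cdots \wedge \vec{b}_k) =
\begin{vmatrix}
\vec{a}_1 \cdot \vec{b}_1 & \vec{a}_1 \cdot \vec{b}_2 & \cdots & \vec{a}_1 \cdot \vec{b}_k \\
\vec{a}_2 \cdot \vec{b}_1 & \vec{a}_2 \cdot \vec{b}_2 & \cdots & \vec{a}_2 \cdot \vec{b}_k \\
\vdots & \vdots & \ddots & \vdots \\
\vec{a}_k \cdot \vec{b}_1 & \vec{a}_k \cdot \vec{b}_2 & \cdots & \vec{a}_k \cdot \vec{b}_k
\end{vmatrix}.
\]


It turns out that this definition is equivalent to the usual dot product on $\bigwedge^k \mathbb{R}^n$
with respect to its standard basis, namely,
\[ \{ \vec{e}_{i_1} \wedge \vec{e}_{i_2} \wedge \cdots \wedge \vec{e}_{i_k} \}, \quad \text{with } 1 \le i_1 < i_2 < \cdots < i_k \le n. \]
The equivalence of these two definitions is a result of the following combinatorial
proposition.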


Proposition C.1.2 (Binet-Cauchy). Let $A$ and $B$ be two $n \times m$ matrices, with
$m \le n$. Call $\mathcal{I}(m,n)$ the set of subsets of $\{1, 2, \ldots, n\}$ of size $m$ and for any
$S \in \mathcal{I}(m,n)$, denote $A_S$ as the $m \times m$ submatrix consisting of the rows of $A$ indexed
by $S$ (and similarly for $B$). Then


\[ \det(B^T A) = \sum_{S \in \mathcal{I}(m,n)} (\det(B_S)) (\det(A_S)). \]
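Proof. Let $A = (a_{ij})$ and $B = (b_{ij})$, with $1 \le i \le n$ and $1 \le j \le m$. The matrix
$B^T A$ is an $m \times m$ matrix with entries


\[ \sum_{j=1}^{n} b_{ji} a_{jk}, \]
indexed by $1 \le i, k \le m$. Therefore, the determinant of $B^T A$ is


\[
\det(B^T A) = \sum_{\sigma \in S_m} \operatorname{sign}(\sigma)
\Big( \sum_{j_1=1}^{n} b_{j_1,1} a_{j_1,\sigma(1)} \Big)
\Big( \sum_{j_2=1}^{n} b_{j_2,2} a_{j_2,\sigma(2)} \Big) \cdots
\Big( \sum_{j_m=1}^{n} b_{j_m,m} a_{j_m,\sigma(m)} \Big),
\]


where $S_m$ is the set of permutations on the set $\{1, 2, \ldots, m\}$. Then, after rearranging
the order of summation, we have


\begin{align*}
\det(B^T A) &= \sum_{\sigma \in S_m} \sum_{j_1=1}^{n} \sum_{j_2=1}^{n} \cdots \sum_{j_m=1}^{n}
\operatorname{sign}(\sigma)\, b_{j_1,1} b_{j_2,2} \cdots b_{j_m,m}\, a_{j_1,\sigma(1)} a_{j_2,\sigma(2)} \cdots a_{j_m,\sigma(m)} \\
&= \sum_{j_1=1}^{n} \sum_{j_2=1}^{n} \cdots \sum_{j_m=1}^{n} b_{j_1,1} b_{j_2,2} \cdots b_{j_m,m}
\Big( \sum_{\sigma \in S_m} \operatorname{sign}(\sigma)\, a_{j_1,\sigma(1)} a_{j_2,\sigma(2)} \cdots a_{j_m,\sigma(m)} \Big).
\end{align*}


Because of the sign of the permutation, any term in the summation where not all
the $j_i$ are distinct is equal to 0. Therefore, we only need to consider the summation
over sets of indices $\mathbf{j} = (j_1, j_2, \ldots, j_m) \in \{1, \ldots, n\}^m$, where all of the indices are
distinct. We can parametrize this set in an alternative manner as follows. Let
$\mathcal{I}(m,n)$ be the set of indices in increasing order, i.e.,


\[ \mathcal{I}(m,n) = \{ (j_1, j_2, \ldots, j_m) \in \{1, \ldots, n\}^m \mid 1 \le j_1 < j_2 < \cdots < j_m \le n \}. \tag{C.1} \]


The set $\mathcal{I}(m,n) \times S_m$ is in bijection with the set of all $m$-tuples of indices that are
distinct via


\[ (\mathbf{j}, \tau) \mapsto (j_{\tau(1)}, j_{\tau(2)}, \ldots, j_{\tau(m)}). \]


We can now write


\begin{align*}
\det(B^T A)
&= \sum_{\mathbf{j} \in \mathcal{I}(m,n)} \sum_{\tau \in S_m} b_{j_{\tau(1)},1} b_{j_{\tau(2)},2} \cdots b_{j_{\tau(m)},m}
\Big( \sum_{\sigma \in S_m} \operatorname{sign}(\sigma)\, a_{j_{\tau(1)},\sigma(1)} a_{j_{\tau(2)},\sigma(2)} \cdots a_{j_{\tau(m)},\sigma(m)} \Big) \\
&= \sum_{\mathbf{j} \in \mathcal{I}(m,n)} \sum_{\tau \in S_m} b_{j_{\tau(1)},1} b_{j_{\tau(2)},2} \cdots b_{j_{\tau(m)},m}\,
\operatorname{sign}(\tau) \Big( \sum_{\sigma' \in S_m} \operatorname{sign}(\sigma')\, a_{j_1,\sigma'(1)} a_{j_2,\sigma'(2)} \cdots a_{j_m,\sigma'(m)} \Big) \\
&= \sum_{\mathbf{j} \in \mathcal{I}(m,n)} \sum_{\tau \in S_m} \operatorname{sign}(\tau)\, b_{j_{\tau(1)},1} b_{j_{\tau(2)},2} \cdots b_{j_{\tau(m)},m} \det A_{\mathbf{j}},
\end{align*}


where $A_{\mathbf{j}}$ is the $m \times m$ submatrix obtained from $A$ by using only the rows given in
the $m$-tuple index $\mathbf{j}$. Then we conclude that


\begin{align*}
\det(B^T A) &= \sum_{\mathbf{j} \in \mathcal{I}(m,n)} \sum_{\tau \in S_m} \operatorname{sign}(\tau)\, b_{j_1,\tau^{-1}(1)} b_{j_2,\tau^{-1}(2)} \cdots b_{j_m,\tau^{-1}(m)} \det A_{\mathbf{j}} \\
&= \sum_{\mathbf{j} \in \mathcal{I}(m,n)} (\det B_{\mathbf{j}}^T)(\det A_{\mathbf{j}}),
\end{align*}


and the proposition follows since $\det(C^T) = \det(C)$ for any square matrix $C$.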
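
The identity in Proposition C.1.2 is easy to test on small integer matrices. The following is a minimal Python sketch (not from the original text) that compares $\det(B^T A)$ with the sum over row subsets; the matrices `A` and `B` and all function names are illustrative assumptions, and the determinant is computed by the permutation sum used in the proof, which is only practical for small sizes.

```python
from itertools import combinations, permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images, via inversion count."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(M):
    """Determinant of a square matrix (list of rows) by the permutation sum."""
    total = 0
    for perm in permutations(range(len(M))):
        term = sign(perm)
        for row, col in enumerate(perm):
            term *= M[row][col]
        total += term
    return total

def transpose_product(B, A):
    """Return B^T A for two n x m matrices stored as lists of rows."""
    n, m = len(A), len(A[0])
    return [[sum(B[j][i] * A[j][k] for j in range(n)) for k in range(m)]
            for i in range(m)]

def binet_cauchy_sum(B, A):
    """Sum of det(B_S) det(A_S) over all m-element row subsets S."""
    n, m = len(A), len(A[0])
    return sum(det([B[r] for r in S]) * det([A[r] for r in S])
               for S in combinations(range(n), m))

# Two arbitrary 4 x 2 integer matrices.
A = [[1, 2], [0, 1], [3, -1], [2, 2]]
B = [[2, 0], [1, 1], [-1, 4], [0, 3]]
lhs = det(transpose_product(B, A))
rhs = binet_cauchy_sum(B, A)
```

With integer entries both sides are computed exactly, so `lhs == rhs` can be checked without any tolerance.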


Corollary C.1.3. Definition C.1.1 is equivalent to the dot product on $\bigwedge^k \mathbb{R}^n$ with
respect to the standard basis.


Proof. If $\vec{a}_1, \vec{a}_2, \ldots, \vec{a}_k$ is a $k$-tuple of vectors in $\mathbb{R}^n$, call $A$ the $n \times k$ matrix that
has the vector $\vec{a}_i$ as the $i$th column. Let $\mathcal{I}(k,n)$ be as in Proposition C.1.2. For
any subset $S$ of $\{1, 2, \ldots, n\}$ of cardinality $k$, define


\[ \vec{e}_S = \vec{e}_{s_1} \wedge \vec{e}_{s_2} \wedge \cdots \wedge \vec{e}_{s_k}, \]


where $S = \{s_1, s_2, \ldots, s_k\}$, with the elements listed in increasing order. It is not
hard to check that


\[ \vec{a}_1 \wedge \vec{a}_2 \wedge \cdots \wedge \vec{a}_k = \sum_{S \in \mathcal{I}(k,n)} (\det(A_S))\, \vec{e}_S. \tag{C.2} \]


The corollary follows immediately from Proposition C.1.2.


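As with a usual Euclidean vector space $\mathbb{R}^n$, we define the Euclidean norm in the
following way.


Definition C.1.4. Let $a = \vec{a}_1 \wedge \vec{a}_2 \wedge \cdots \wedge \vec{a}_k \in \bigwedge^k \mathbb{R}^n$. The (Euclidean) norm of
this vector is


\[ \| \vec{a}_1 \wedge \vec{a}_2 \wedge \cdots \wedge \vec{a}_k \| = \sqrt{a \cdot a}. \]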


Corollary C.1.5. The $k$-dimensional volume of a parallelepiped in $\mathbb{R}^n$ spanned by
$k$ vectors $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_k$ is given by


\[ \| \vec{v}_1 \wedge \vec{v}_2 \wedge \cdots \wedge \vec{v}_k \|. \]


Proof. It is a standard fact in linear algebra (see [14, Fact 6.3.7]) that the $k$-volume
of the described parallelepiped is $\sqrt{\det(A^T A)}$, where $A$ is the matrix that has
the vector $\vec{v}_i$ as the $i$th column. The corollary follows from Definitions C.1.1 and
C.1.4.
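
The two descriptions of the $k$-volume can be compared numerically. The sketch below (not part of the original text) computes $\sqrt{\det(A^T A)}$ directly and, separately, the norm obtained from the coefficients $\det(A_S)$ in Equation (C.2). The sample vectors and function names are illustrative assumptions.

```python
import math
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not M:
        return 1
    return sum((-1) ** c * M[0][c] * det([row[:c] + row[c + 1:] for row in M[1:]])
               for c in range(len(M)))

def k_volume_gram(vectors):
    """sqrt(det(A^T A)), where the given vectors are the columns of A."""
    gram = [[sum(a * b for a, b in zip(v, w)) for w in vectors] for v in vectors]
    return math.sqrt(det(gram))

def k_volume_minors(vectors):
    """Norm of v_1 ^ ... ^ v_k: square root of the sum of squared k x k minors."""
    k, n = len(vectors), len(vectors[0])
    total = 0
    for S in combinations(range(n), k):
        # Rows of this matrix are the vectors restricted to the coordinates in S,
        # i.e. the transpose of A_S, which has the same determinant.
        minor = [[v[s] for s in S] for v in vectors]
        total += det(minor) ** 2
    return math.sqrt(total)

# A parallelogram in R^3; its area also equals the norm of the cross product (2, -4, 3).
v, w = (1, 2, 2), (0, 3, 4)
area_gram = k_volume_gram([v, w])
area_minors = k_volume_minors([v, w])
```

Both `area_gram` and `area_minors` agree with $\sqrt{29}$, the length of the cross product of the two sample vectors.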
PROBLEMS
C.1.1. Use the results of this section to calculate the area of the parallelogram in $\mathbb{R}^3$
spanned by
\[ \vec{v} = \begin{pmatrix} 1 \\ -3 \\ 7 \end{pmatrix} \quad \text{and} \quad \vec{w} = \begin{pmatrix} 4 \\ 5 \\ -2 \end{pmatrix}. \]


C.1.2. Calculate the 3-volume of the parallelepiped in $\mathbb{R}^4$ spanned by the vectors
$\vec{a}$, $\vec{b}$, and $\vec{c}$ [entries illegible in the source].


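C.1.3. Using the same vectors $\vec{a}$, $\vec{b}$, and $\vec{c}$ in the previous exercise, determine all vectors
$\vec{x}$ such that the four-dimensional parallelepiped spanned by $\vec{a}$, $\vec{b}$, $\vec{c}$, and $\vec{x}$ has
4-volume 0.


C.1.4. Verify the claim in Equation (C.2).


C.1.5. A Higher Pythagorean Theorem. Let $\vec{a}$, $\vec{b}$, and $\vec{c}$ be three vectors in $\mathbb{R}^n$ that are
mutually perpendicular.


(a) Prove that
\[ \| \vec{a} \wedge \vec{b} + \vec{b} \wedge \vec{c} + \vec{c} \wedge \vec{a} \|^2 = \| \vec{a} \wedge \vec{b} \|^2 + \| \vec{a} \wedge \vec{c} \|^2 + \| \vec{b} \wedge \vec{c} \|^2. \]


(b) Consider the tetrahedron spanned by $\vec{a}$, $\vec{b}$, and $\vec{c}$. Let $S_C$ be the area of the face spanned
by $\vec{a}$ and $\vec{b}$, $S_B$ be the area of the face spanned by $\vec{a}$ and $\vec{c}$, $S_A$ be the area of the face spanned by
$\vec{b}$ and $\vec{c}$, and let $S_D$ be the area of the fourth face of the tetrahedron. Deduce that


\[ S_A^2 + S_B^2 + S_C^2 = S_D^2. \]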


C.2. Volume Form Revisited 


In Example 4.6.24, we introduced the volume form on $\mathbb{R}^n$ in reference to the standard
basis. This is not quite satisfactory for our applications because the standard
basis has internal properties, namely that it is orthonormal with respect to the dot
product. The following proposition presents the volume form on a vector space in
its most general context.


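Proposition C.2.1. Let $V$ be an $n$-dimensional vector space with an inner product
$\langle \cdot, \cdot \rangle$. Then there exists a unique form $\omega \in \bigwedge^n V^*$ such that $\omega(e_1, \ldots, e_n) = 1$
for all oriented bases $(e_1, \ldots, e_n)$ of $V$ that are orthonormal with respect to $\langle \cdot, \cdot \rangle$.
Furthermore, if $(u_1, \ldots, u_n)$ is any oriented basis of $V$, then


\[ \omega = \sqrt{\det A}\; u^{*1} \wedge \cdots \wedge u^{*n}, \]


where $A$ is the matrix with entries $A_{ij} = \langle u_i, u_j \rangle$.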


Proof. Let $(u_1, \ldots, u_n)$ be any basis of $V$ and let $v = \sum_{i=1}^{n} a^i u_i$ and $w = \sum_{i=1}^{n} b^i u_i$ be
two vectors in $V$ along with their coordinates with respect to $(u_1, \ldots, u_n)$. Then
by the linearity of the form,

\[ \langle v, w \rangle = \sum_{i,j=1}^{n} a^i b^j \langle u_i, u_j \rangle = \sum_{i,j=1}^{n} a^i A_{ij} b^j. \]

We remark that $\det A \neq 0$ because otherwise there would exist some nonzero vector
$v$ such that $Av = 0$ and then $\langle v, v \rangle = 0$, which would contradict the positive-definite
property of the form.

The existence of an orthonormal basis with respect to $\langle \cdot, \cdot \rangle$ follows from the
Gram-Schmidt orthonormalization process. If $(e_1, \ldots, e_n)$ is an orthonormal basis
with respect to $\langle \cdot, \cdot \rangle$, then the associated matrix $(\langle e_i, e_j \rangle)$ is the identity matrix.

Given an orthonormal ordered basis $\mathcal{E} = (e_1, \ldots, e_n)$, let $(e^{*1}, \ldots, e^{*n})$ be the
cobasis of $V^*$. Set $\omega = e^{*1} \wedge \cdots \wedge e^{*n}$. Obviously, $\omega(e_1, \ldots, e_n) = 1$. Now, if
$\mathcal{B} = (u_1, \ldots, u_n)$ is any other orthonormal basis of $V$ with the same orientation as
$(e_1, \ldots, e_n)$, then $\det(M^T M) = 1$, where $M$ is the transition matrix from coordinates
in $(e_1, \ldots, e_n)$ to coordinates in $(u_1, \ldots, u_n)$. Hence, $\det(M)^2 = 1$, and the
assumption that $(u_1, \ldots, u_n)$ has the same orientation as $(e_1, \ldots, e_n)$ means that
$\det(M)$ is positive. Thus, $\det M = 1$.

By Proposition 4.1.6, the transition matrix from coordinates in $(e^{*1}, \ldots, e^{*n})$
to coordinates in $(u^{*1}, \ldots, u^{*n})$ is $M^{-1}$. However, by Proposition 4.6.23, we then
conclude that


\[ \omega = e^{*1} \wedge \cdots \wedge e^{*n} = \det(M)\, u^{*1} \wedge \cdots \wedge u^{*n} = u^{*1} \wedge \cdots \wedge u^{*n}. \]


Since $u^{*1} \wedge \cdots \wedge u^{*n}(u_1, \ldots, u_n) = 1$, $\omega$ evaluates to 1 on all bases of $V$ that
are orthonormal and have the same orientation as $\{e_1, \ldots, e_n\}$.
Suppose now that $\{u_1, \ldots, u_n\}$ is any basis of $V$, not necessarily orthonormal.


Again let $M$ be the coordinate change matrix as above. By definition $[u_j]_{\mathcal{E}} = M e_j$.
Then we can calculate the coefficients of $A$ by


\[ A_{ij} = \langle u_i, u_j \rangle = [u_i]_{\mathcal{E}} \cdot [u_j]_{\mathcal{E}} = (M e_i)^T (M e_j) = e_i^T M^T M e_j. \]


Hence, we have shown that $A = M^T M$. We conclude that


\[ \omega = \det(M)\, u^{*1} \wedge \cdots \wedge u^{*n} = \sqrt{\det A}\; u^{*1} \wedge \cdots \wedge u^{*n}. \]


Definition C.2.2. Let $(V, \langle \cdot, \cdot \rangle)$ be an inner product space of dimension $n$. Then
the element $\omega \in \bigwedge^n V^*$ defined in Proposition C.2.1 is called the volume form of $V$.
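
For $V = \mathbb{R}^3$ with the dot product, the relation $A = M^T M$ from the proof can be checked numerically. The sketch below (not part of the original text) builds the Gram matrix of a sample basis and compares $\sqrt{\det A}$ with $\det M$; the basis and the function names are illustrative assumptions, and the basis is chosen to be positively oriented.

```python
import math

def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not M:
        return 1
    return sum((-1) ** c * M[0][c] * det([row[:c] + row[c + 1:] for row in M[1:]])
               for c in range(len(M)))

def gram_matrix(basis):
    """Matrix A with entries A_ij = <u_i, u_j> for the standard dot product."""
    return [[sum(a * b for a, b in zip(u, v)) for v in basis] for u in basis]

def volume_coefficient(basis):
    """sqrt(det A): the coefficient of u^{*1} ^ ... ^ u^{*n} in the volume form."""
    return math.sqrt(det(gram_matrix(basis)))

# A positively oriented, non-orthonormal basis of R^3 (illustrative choice).
basis = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]
# M has the basis vectors as columns, i.e. their coordinates in the standard basis.
M = [[u[r] for u in basis] for r in range(3)]
coefficient = volume_coefficient(basis)
```

Here `det(gram_matrix(basis))` equals `det(M) ** 2`, so `coefficient` coincides with `det(M)`, which is the value of the standard volume form on the three basis vectors.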


C.3 Hodge Star Operator 


We conclude this section by introducing an operator on wedge product spaces $\bigwedge^k V$.
In this subsection, we assume throughout that $(V, \langle \cdot, \cdot \rangle)$ is a finite dimensional inner
product space.
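Exercise 4.2.4 introduced the bijection $\psi : V \to V^*$ defined by $\psi(v) = \lambda_v$, where
$\lambda_v(w) = \langle v, w \rangle$ for all $w \in V$. Exercise 4.2.5 gave steps to extend the inner product
$\langle \cdot, \cdot \rangle$ to an inner product $\langle \cdot, \cdot \rangle^*$ on $V^*$ by defining


\[ \langle \lambda_v, \lambda_w \rangle^* = \langle w, v \rangle = \langle v, w \rangle, \]
since $\langle \cdot, \cdot \rangle$ is symmetric. In other words, for all $\eta, \tau \in V^*$,
\[ \langle \eta, \tau \rangle^* = \langle \psi^{-1}(\eta), \psi^{-1}(\tau) \rangle. \tag{C.3} \]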


From now on in this section, we drop the superscript * on the inner product extended 
to the dual. 


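Proposition C.3.1. Let $\eta_1, \ldots, \eta_r, \tau_1, \ldots, \tau_r \in V^*$. Setting
\[ \langle \eta_1 \wedge \cdots \wedge \eta_r, \tau_1 \wedge \cdots \wedge \tau_r \rangle = \det(\langle \eta_i, \tau_j \rangle) \]


defines an inner product on $\bigwedge^r V^*$.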


Proof. (Left as an exercise for the reader. See Problem C.3.3.) 


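Definition C.3.2. Let $(V, \langle \cdot, \cdot \rangle)$ be an inner product space of dimension $n$, and
let $\omega \in \bigwedge^n V^*$ be the volume form. The Hodge star operator is the operator
$\star : \bigwedge^k V^* \to \bigwedge^{n-k} V^*$ that is uniquely determined by


\[ \langle \star\eta, \tau \rangle\, \omega = \eta \wedge \tau \]
for all $\tau \in \bigwedge^{n-k} V^*$.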


The Hodge star operator has the following nice properties, which we leave as 
exercises. 


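Proposition C.3.3. Let $(V, \langle \cdot, \cdot \rangle)$ be an inner product space. Let $B = \{e_1, \ldots, e_n\}$
be a basis that is orthonormal with respect to $\langle \cdot, \cdot \rangle$, and let $B^* = \{e^{*1}, \ldots, e^{*n}\}$ be the cobasis of $V^*$. Set $\omega$
as the volume form with respect to $\langle \cdot, \cdot \rangle$.


1. The Hodge star operator $\star$ is well defined and linear.


2. Viewing 1 as an element of $\mathbb{R} = \bigwedge^0 V^*$, then $\star 1 = \omega$.
3. For any $k \le n$, then $\star(e^{*1} \wedge \cdots \wedge e^{*k}) = e^{*(k+1)} \wedge \cdots \wedge e^{*n}$.
4. For any $k$-tuple $(i_1, \ldots, i_k)$ of increasing indices,
\[ \star(e^{*i_1} \wedge \cdots \wedge e^{*i_k}) = (\operatorname{sign} \sigma)\, e^{*j_1} \wedge \cdots \wedge e^{*j_{n-k}}, \]
where the $j_\ell$ indices are such that $\{i_1, \ldots, i_k, j_1, \ldots, j_{n-k}\} = \{1, \ldots, n\}$ and $\sigma$
is the permutation that maps the $n$-tuple $(i_1, \ldots, i_k, j_1, \ldots, j_{n-k})$ to $(1, 2, \ldots, n)$.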


Example C.3.4. This Proposition allows us to easily calculate the Hodge star
operator of any $(0, k)$-tensor over $V$. For example, suppose that $V = \mathbb{R}^4$ is equipped
with the usual dot product and that $(e_1, e_2, e_3, e_4)$ is the standard basis. Then using
the above Proposition, we calculate that


\[ \star(e^{*1} + 2e^{*2} + 3e^{*4}) = e^{*2} \wedge e^{*3} \wedge e^{*4} - 2e^{*1} \wedge e^{*3} \wedge e^{*4} - 3e^{*1} \wedge e^{*2} \wedge e^{*3}. \]
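
The rule in item 4 of Proposition C.3.3 is mechanical enough to sketch in a few lines of Python. The code below is an illustration, not part of the original text: it stores a form as a dictionary from increasing index tuples to coefficients (a representation chosen here for convenience) and assumes an orthonormal basis. It reproduces the calculation of Example C.3.4.

```python
def permutation_sign(seq):
    """Sign of the permutation taking the sequence seq to increasing order."""
    inversions = sum(1 for i in range(len(seq))
                     for j in range(i + 1, len(seq)) if seq[i] > seq[j])
    return -1 if inversions % 2 else 1

def hodge_star_basis(indices, n):
    """Return (sign, complement) for the star of e^{*i_1} ^ ... ^ e^{*i_k}.

    indices is an increasing tuple of 1-based indices; the basis is assumed orthonormal.
    """
    complement = tuple(j for j in range(1, n + 1) if j not in indices)
    return permutation_sign(tuple(indices) + complement), complement

def hodge_star(form, n):
    """Apply the star linearly to a form given as {increasing index tuple: coefficient}."""
    result = {}
    for indices, coeff in form.items():
        s, complement = hodge_star_basis(indices, n)
        result[complement] = result.get(complement, 0) + s * coeff
    return result

# Example C.3.4: the 1-form e^{*1} + 2 e^{*2} + 3 e^{*4} on R^4.
eta = {(1,): 1, (2,): 2, (4,): 3}
star_eta = hodge_star(eta, 4)
```

The resulting dictionary `star_eta` has coefficient 1 on `(2, 3, 4)`, coefficient -2 on `(1, 3, 4)`, and coefficient -3 on `(1, 2, 3)`, matching the example.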


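The following proposition gives a formula for the coordinates of $\star\eta$ in terms
of the coordinates of $\eta$.


Proposition C.3.5. Let $(V, \langle \cdot, \cdot \rangle)$ be an inner product space. Let $B = \{u_1, \ldots, u_n\}$
be any basis of $V$, and denote by $\{u^{*1}, \ldots, u^{*n}\}$ its cobasis in $V^*$. Let $A$ be the
matrix with entries $a_{ij} = \langle u_i, u_j \rangle$, and label $a^{ij}$ as the components of the inverse
$A^{-1}$. If $\eta \in \bigwedge^k V^*$, with coordinates $\eta_{i_1 \cdots i_k}$, so that


\[ \eta = \sum_{1 \le i_1 < \cdots < i_k \le n} \eta_{i_1 \cdots i_k}\, u^{*i_1} \wedge \cdots \wedge u^{*i_k}, \]


then the components of $\star\eta$ with respect to $B^*$ are


\[ (\star\eta)_{j_{k+1} \cdots j_n} = \frac{\sqrt{\det A}}{k!}\, \varepsilon_{j_1 \cdots j_n}\, a^{j_1 i_1} \cdots a^{j_k i_k}\, \eta_{i_1 \cdots i_k}, \tag{C.4} \]


where $\varepsilon_{h_1 \cdots h_n}$ is the permutation symbol defined in (4.85), the repeated indices
$j_1, \ldots, j_k$ and $i_1, \ldots, i_k$ are summed from 1 to $n$, and the coordinates $\eta_{i_1 \cdots i_k}$ are
extended to all orderings of the indices by antisymmetry.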


Proof. By a calculation similar to the one in the proof of Proposition C.2.1 and
using the definition of the inner product on 1-forms given in Equation (C.3), we
determine that
\[ \langle u^{*i}, u^{*j} \rangle = a^{ij}, \]

i.e., the $(i, j)$th entry of the inverse $A^{-1}$.

As above, denote by $\omega$ the volume form on $V$ associated to $\langle \cdot, \cdot \rangle$.

A few preliminary notations will render the rest of the proof shorter. Recall the
set $\mathcal{I}(m,n)$ defined in Equation (C.1). For any sequence $\mathbf{i} = (i_1, \ldots, i_k) \in \mathcal{I}(k,n)$,
we denote by $u^{*\mathbf{i}}$ the wedge product


\[ u^{*\mathbf{i}} = u^{*i_1} \wedge \cdots \wedge u^{*i_k}. \]


Denote also by $\mathbf{i}'$ the increasing sequence of length $n - k$ such that $\{\mathbf{i}, \mathbf{i}'\} =
\{1, 2, \ldots, n\}$. We call $\mathbf{i}'$ the complement of $\mathbf{i}$. We define the permutation $\sigma_{\mathbf{i}} \in S_n$ by
the permutation that maps the sequence $(1, 2, \ldots, n)$ to the sequence $(\mathbf{i}, \mathbf{i}')$. Note
that the sign of the permutation satisfies


\[ \operatorname{sign} \sigma_{\mathbf{i}} = \varepsilon_{i_1 \cdots i_k i'_1 \cdots i'_{n-k}}. \]


Consider the $k$th wedge product $u^{*\mathbf{i}}$. According to Definition C.3.2,


\[ \langle \star u^{*\mathbf{i}}, \tau \rangle\, \omega = u^{*\mathbf{i}} \wedge \tau. \tag{C.5} \]


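We know that $\{u^{*\mathbf{j}}\}$ for $\mathbf{j} \in \mathcal{I}(n-k, n)$ forms a basis of $\bigwedge^{n-k} V^*$. Thus, we can
write
\[ \star u^{*\mathbf{i}} = \sum_{\mathbf{j} \in \mathcal{I}(n-k,n)} c_{\mathbf{j}}\, u^{*\mathbf{j}} \]


for some constants $c_{\mathbf{j}}$. However, Equation (C.5) imposes that $\langle \star u^{*\mathbf{i}}, u^{*\mathbf{j}} \rangle = 0$ unless
$\mathbf{j} = \mathbf{i}'$. We denote $K = \langle \star u^{*\mathbf{i}}, u^{*\mathbf{i}'} \rangle$.
By Definition C.3.2, $\langle \star u^{*\mathbf{i}}, u^{*\mathbf{i}'} \rangle\, \omega = u^{*\mathbf{i}} \wedge u^{*\mathbf{i}'}$, so by Proposition C.2.1,


\[ \langle \star u^{*\mathbf{i}}, u^{*\mathbf{i}'} \rangle \sqrt{\det A}\; u^{*1} \wedge \cdots \wedge u^{*n} = (\operatorname{sign} \sigma_{\mathbf{i}})\, u^{*1} \wedge \cdots \wedge u^{*n}, \]


which implies that


\[ \langle \star u^{*\mathbf{i}}, u^{*\mathbf{j}} \rangle = \frac{\operatorname{sign} \sigma_{\mathbf{i}}}{\sqrt{\det A}}\, \delta_{\mathbf{i}'\mathbf{j}}, \]


where $\delta_{\mathbf{i}'\mathbf{j}} = 1$ if $\mathbf{j} = \mathbf{i}'$ and equals 0 otherwise. On the other hand,


\[ \langle \star u^{*\mathbf{i}}, u^{*\mathbf{j}} \rangle = \sum_{\mathbf{l} \in \mathcal{I}(n-k,n)} c_{\mathbf{l}} \langle u^{*\mathbf{l}}, u^{*\mathbf{j}} \rangle = \sum_{\mathbf{l} \in \mathcal{I}(n-k,n)} c_{\mathbf{l}} \det((A^{-1})_{\mathbf{l}\mathbf{j}}), \]


where by $A_{\mathbf{l}\mathbf{j}}$ we mean the submatrix of $A$ consisting of the rows $\mathbf{l} = (l_1, \ldots, l_{n-k})$ and
columns $\mathbf{j} = (j_1, \ldots, j_{n-k})$. So, we conclude that


\[ \sum_{\mathbf{l} \in \mathcal{I}(n-k,n)} c_{\mathbf{l}} \det((A^{-1})_{\mathbf{l}\mathbf{j}}) = \frac{\operatorname{sign} \sigma_{\mathbf{i}}}{\sqrt{\det A}}\, \delta_{\mathbf{i}'\mathbf{j}}. \tag{C.6} \]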


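To find the values of $c_{\mathbf{l}}$ for a given $\mathbf{i}$, we need to invert the matrix product
in Equation (C.6), or more precisely, find the inverse of the $\binom{n}{k} \times \binom{n}{k}$ matrix
$\det((A^{-1})_{\mathbf{l}\mathbf{j}})$. Though a little tedious to show, the following formula generalizes
the Laplace expansion formula for determinants. For any $n \times n$ matrix $B$, with
notations as above,


\[ \det B = \sum_{\mathbf{j} \in \mathcal{I}(k,n)} (\operatorname{sign} \sigma_{\mathbf{i}})(\operatorname{sign} \sigma_{\mathbf{j}}) \det B_{\mathbf{i}\mathbf{j}} \det B_{\mathbf{i}'\mathbf{j}'}. \tag{C.7} \]


A slightly stronger result gives


\[
\sum_{\mathbf{j} \in \mathcal{I}(k,n)} \det B_{\mathbf{i}\mathbf{j}}\, (\operatorname{sign} \sigma_{\mathbf{h}})(\operatorname{sign} \sigma_{\mathbf{j}}) \det B_{\mathbf{h}'\mathbf{j}'} =
\begin{cases}
\det B & \text{if } \mathbf{h} = \mathbf{i}, \\
0 & \text{otherwise,}
\end{cases}
\tag{C.8}
\]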
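for all $\mathbf{h} \in \mathcal{I}(k,n)$. We apply Equation (C.8) to $B = A^{-1}$, with $n - k$ in the role of $k$.
Multiplying Equation (C.6) by
\[ (\operatorname{sign} \sigma_{\mathbf{h}})(\operatorname{sign} \sigma_{\mathbf{j}}) \det((A^{-1})_{\mathbf{h}'\mathbf{j}'}), \]
with $\mathbf{h} \in \mathcal{I}(n-k, n)$, summing the result over $\mathbf{j} \in \mathcal{I}(n-k, n)$, and taking into account $\delta_{\mathbf{i}'\mathbf{j}}$, we obtain


\[ \det(A^{-1})\, c_{\mathbf{h}} = \frac{\operatorname{sign} \sigma_{\mathbf{i}}}{\sqrt{\det A}}\, (\operatorname{sign} \sigma_{\mathbf{h}})(\operatorname{sign} \sigma_{\mathbf{i}'}) \det((A^{-1})_{\mathbf{h}'\mathbf{i}}), \]
which implies, since $\det(A^{-1}) = 1/\det A$ and $(\operatorname{sign} \sigma_{\mathbf{i}})(\operatorname{sign} \sigma_{\mathbf{i}'})(\operatorname{sign} \sigma_{\mathbf{h}}) = \operatorname{sign} \sigma_{\mathbf{h}'}$, that
\[ c_{\mathbf{h}} = (\operatorname{sign} \sigma_{\mathbf{h}'}) \sqrt{\det A}\, \det((A^{-1})_{\mathbf{h}'\mathbf{i}}). \]


From this we deduce that


\[
\star(u^{*i_1} \wedge \cdots \wedge u^{*i_k}) = \sum_{j_1=1}^{n} \cdots \sum_{j_n=1}^{n} \frac{\sqrt{\det A}}{(n-k)!}\, \varepsilon_{j_1 \cdots j_n}\, a^{j_1 i_1} \cdots a^{j_k i_k}\, (u^{*j_{k+1}} \wedge \cdots \wedge u^{*j_n}),
\tag{C.9}
\]


and the proposition follows by linearity of the Hodge star operator.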
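
In the smallest nontrivial case, $n = 2$ and $k = 1$, the formula can be checked against Definition C.3.2 directly. The sketch below is not part of the original text; the Gram matrix used with it is an arbitrary positive-definite example and the function names are illustrative assumptions. It computes the components of $\star\eta$ for a 1-form $\eta$ and provides what is needed to verify that $\langle \star\eta, \tau \rangle \sqrt{\det A}$ equals the coefficient of $u^{*1} \wedge u^{*2}$ in $\eta \wedge \tau$.

```python
import math

def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix given as a list of rows."""
    det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det_A, -A[0][1] / det_A],
            [-A[1][0] / det_A, A[0][0] / det_A]]

def epsilon(j1, j2):
    """Permutation symbol on two 0-based indices: +1 for (0, 1), -1 for (1, 0), else 0."""
    return j2 - j1

def hodge_star_1form(eta, A):
    """Components of *eta for eta = eta[0] u^{*1} + eta[1] u^{*2} (n = 2, k = 1)."""
    det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    a_inv = inverse_2x2(A)
    return [math.sqrt(det_A) * sum(epsilon(j1, j2) * a_inv[j1][i] * eta[i]
                                   for j1 in range(2) for i in range(2))
            for j2 in range(2)]

def inner_1forms(eta, tau, A):
    """Inner product of two 1-forms, using <u^{*i}, u^{*j}> = a^{ij}."""
    a_inv = inverse_2x2(A)
    return sum(eta[i] * a_inv[i][j] * tau[j] for i in range(2) for j in range(2))

def wedge_coefficient(eta, tau):
    """Coefficient of u^{*1} ^ u^{*2} in eta ^ tau."""
    return eta[0] * tau[1] - eta[1] * tau[0]
```

With the Gram matrix `[[2.0, 1.0], [1.0, 3.0]]`, for instance, `inner_1forms(hodge_star_1form(eta, A), tau, A) * math.sqrt(5)` agrees with `wedge_coefficient(eta, tau)` for every pair of cobasis elements, and with the identity Gram matrix the function reduces to the orthonormal rule of Proposition C.3.3.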


We point out that one can loosen the conditions on the bilinear form (-,-) and 
still define the Hodge star operator and obtain many of the same results. If we only 
assume that (-,-) is symmetric and nondegenerate, then all the above propositions 
hold except that one must replace det A with | det A| in Proposition C.3.5. 


PROBLEMS 


C.3.1. Let $V = \mathbb{R}^4$ equipped with dot product and let $(e_1, e_2, e_3, e_4)$ be the standard
basis. Calculate


(a) *(2e*! A e*? A e** 4 5e*? Ae* A e**); 
(b) *(17e*! A e*? — 3e*! A e*4 + 4e*? A e**). 


C.3.2. Let $V = \mathbb{R}^3$ equipped with dot product and let $(e_1, e_2, e_3)$ be the standard basis.
Let $(u_1, u_2, u_3)$ with coordinates with respect to the standard basis as


2 1 1 
uw={1ltl, w= ]2), usfl 
I: 1 


Use Proposition C.3.5 to calculate: 
(a) $\star(4u^{*1} - 7u^{*2} + 5u^{*3})$;
(b) «(2u** A u*? — 3u*! Aut). 
C.3.3. Prove Proposition C.3.1. 
C.3.4. Prove Proposition C.3.3. 


C.3.5. Let $(V, \langle \cdot, \cdot \rangle)$ be an inner product space. Prove that the composition $\star \circ \star :
\bigwedge^k V^* \to \bigwedge^k V^*$ is tantamount to multiplication on $\bigwedge^k V^*$ by $(-1)^{k(n-k)}$. Suppose
instead that $\langle \cdot, \cdot \rangle$ is a symmetric and nondegenerate bilinear form with signature
$(p, q, 0)$. Prove that in this case $\star \circ \star : \bigwedge^k V^* \to \bigwedge^k V^*$ is tantamount to
multiplication on $\bigwedge^k V^*$ by $(-1)^{k(n-k)} (-1)^q$.